keras/models.md at 91a15fdf43d17d005c82c3dc649baa8f6ff06c36

2015-07-05 21:05:56 -07:00

Sequential

Linear stack of layers.

model = keras.models.Sequential()

Methods:
- add(layer): Add a layer to the model.
- compile(optimizer, loss, class_mode="categorical", theano_mode=None):
  - Arguments:
    - optimizer: str (name of optimizer) or optimizer object. See optimizers.
    - loss: str (name of objective function) or objective function. See objectives.
    - class_mode: one of "categorical", "binary". This is only used for computing classification accuracy or using the predict_classes method.
    - theano_mode: A theano.compile.mode.Mode (reference) instance specifying compilation options.
- fit(X, y, batch_size=128, nb_epoch=100, verbose=1, validation_split=0., validation_data=None, shuffle=True, show_accuracy=False, callbacks=[], class_weight=None, sample_weight=None): Train a model for a fixed number of epochs.
  - Return: a history dictionary with a record of training loss values at successive epochs, as well as validation loss values (if applicable), accuracy (if applicable), etc.
  - Arguments:
    - X: data.
    - y: labels.
    - batch_size: int. Number of samples per gradient update.
    - nb_epoch: int. Number of epochs to train for.
    - verbose: 0 for no logging to stdout, 1 for progress bar logging, 2 for one log line per epoch.
    - callbacks: keras.callbacks.Callback list. List of callbacks to apply during training. See callbacks.
    - validation_split: float (0. < x < 1). Fraction of the data to use as held-out validation data.
    - validation_data: tuple (X, y) to be used as held-out validation data. Will override validation_split.
    - shuffle: boolean. Whether to shuffle the samples at each epoch.
    - show_accuracy: boolean. Whether to display class accuracy in the logs to stdout at each epoch.
    - class_weight: dictionary mapping classes to a weight value, used for scaling the loss function (during training only).
    - sample_weight: list or numpy array with 1:1 mapping to the training samples, used for scaling the loss function (during training only).
- evaluate(X, y, batch_size=128, show_accuracy=False, verbose=1): Show performance of the model over some validation data.
  - Return: The loss score over the data, or a (loss, accuracy) tuple if show_accuracy=True.
  - Arguments: Same meaning as fit method above. verbose is used as a binary flag (progress bar or nothing).
- predict(X, batch_size=128, verbose=1):
  - Return: An array of predictions for some test data.
  - Arguments: Same meaning as fit method above.
- predict_classes(X, batch_size=128, verbose=1): Return an array of class predictions for some test data.
  - Return: An array of labels for some test data.
  - Arguments: Same meaning as fit method above. verbose is used as a binary flag (progress bar or nothing).
- train_on_batch(X, y, accuracy=False): Single gradient update on one batch.
  - Return: loss over the data, or tuple (loss, accuracy) if accuracy=True.
- test_on_batch(X, y, accuracy=False): Single performance evaluation on one batch.
  - Return: loss over the data, or tuple (loss, accuracy) if accuracy=True.
- save_weights(fname, overwrite=False): Store the weights of all layers to an HDF5 file. If overwrite==False and the file already exists, an exception will be thrown.
- load_weights(fname): Sets the weights of a model, based on weights stored by save_weights. You can only call load_weights on a save file from a model with an identical architecture. load_weights can be called either before or after the compile step.
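To make the class_weight and sample_weight arguments of fit more concrete, here is a minimal pure-Python sketch of how per-sample weights can scale a loss. It is an illustration under assumptions, not the Keras implementation: the helpers `weights_from_classes` and `weighted_mean_loss` are hypothetical, and the sketch assumes the weighted per-sample losses are simply averaged.

```python
def weights_from_classes(labels, class_weight):
    """Expand a {class: weight} dictionary into one weight per sample."""
    return [class_weight[label] for label in labels]


def weighted_mean_loss(per_sample_losses, sample_weight):
    """Scale each sample's loss by its weight, then average (assumed reduction)."""
    if len(per_sample_losses) != len(sample_weight):
        raise ValueError("sample_weight must map 1:1 to the samples")
    scaled = [loss * w for loss, w in zip(per_sample_losses, sample_weight)]
    return sum(scaled) / len(scaled)


labels = [0, 1, 1, 0]
losses = [0.2, 0.4, 0.6, 0.8]

# Errors on class 1 count twice as much as errors on class 0.
weights = weights_from_classes(labels, {0: 1.0, 1: 2.0})
weighted = weighted_mean_loss(losses, weights)
unweighted = weighted_mean_loss(losses, [1.0] * len(losses))
```

In this toy case the class weights raise the average loss, because the two class-1 samples now contribute double.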

Examples:

from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD

model = Sequential()
model.add(Dense(64, 2, init='uniform'))
model.add(Activation('softmax'))

model.compile(loss='mse', optimizer='sgd')

'''
Demonstration of verbose modes 1 and 2
'''
model.fit(X_train, y_train, nb_epoch=3, batch_size=16, verbose=1)
# outputs
'''
Train on 37800 samples, validate on 4200 samples
Epoch 0
37800/37800 [==============================] - 7s - loss: 0.0385
Epoch 1
37800/37800 [==============================] - 8s - loss: 0.0140
Epoch 2
10960/37800 [=======>......................] - ETA: 4s - loss: 0.0109
'''

model.fit(X_train, y_train, nb_epoch=3, batch_size=16, verbose=2)
# outputs
'''
Train on 37800 samples, validate on 4200 samples
Epoch 0
loss: 0.0190
Epoch 1
loss: 0.0146
Epoch 2
loss: 0.0049
'''

'''
Demonstration of show_accuracy
'''
model.fit(X_train, y_train, nb_epoch=3, batch_size=16, verbose=2, show_accuracy=True)
# outputs
'''
Train on 37800 samples, validate on 4200 samples
Epoch 0
loss: 0.0190 - acc.: 0.8750
Epoch 1
loss: 0.0146 - acc.: 0.8750
Epoch 2
loss: 0.0049 - acc.: 1.0000
'''

'''
Demonstration of validation_split
'''
model.fit(X_train, y_train, nb_epoch=3, batch_size=16, validation_split=0.1, show_accuracy=True, verbose=1)
# outputs
'''
Train on 37800 samples, validate on 4200 samples
Epoch 0
37800/37800 [==============================] - 7s - loss: 0.0385 - acc.: 0.7258 - val. loss: 0.0160 - val. acc.: 0.9136
Epoch 1
37800/37800 [==============================] - 8s - loss: 0.0140 - acc.: 0.9265 - val. loss: 0.0109 - val. acc.: 0.9383
Epoch 2
10960/37800 [=======>......................] - ETA: 4s - loss: 0.0109 - acc.: 0.9420
'''
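The "Train on 37800 samples, validate on 4200 samples" line is what validation_split=0.1 gives for 42000 samples. The sketch below reproduces that arithmetic in plain Python. It assumes the held-out fraction is taken from the end of the data, which is an assumption about the implementation rather than something this page states; `split_validation` is a hypothetical helper.

```python
def split_validation(samples, validation_split):
    """Hold out the trailing fraction of `samples` as validation data."""
    if not 0.0 < validation_split < 1.0:
        raise ValueError("validation_split must satisfy 0 < x < 1")
    split_at = int(len(samples) * (1.0 - validation_split))
    return samples[:split_at], samples[split_at:]


data = list(range(42000))
train, val = split_validation(data, 0.1)
print(len(train), len(val))
```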

Graph

Arbitrary connection graph. It can have any number of inputs and outputs, with each output trained with its own loss function. The quantity being optimized by a Graph model is the sum of all loss functions over the different outputs.

model = keras.models.Graph()
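As a rough illustration of "the sum of all loss functions over the different outputs", the following pure-Python sketch computes one loss per named output and adds them up. The `mse` and `total_loss` helpers and the toy values are hypothetical; the sketch does not use or reproduce Keras internals.

```python
def mse(y_true, y_pred):
    """Mean squared error over two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def total_loss(losses, targets, predictions):
    """Sum the per-output losses; all three dictionaries are keyed by output name."""
    return sum(
        loss_fn(targets[name], predictions[name])
        for name, loss_fn in losses.items()
    )


losses = {'output1': mse, 'output2': mse}
targets = {'output1': [1.0, 0.0], 'output2': [2.0, 2.0]}
predictions = {'output1': [0.0, 0.0], 'output2': [2.0, 4.0]}

objective = total_loss(losses, targets, predictions)
```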

Methods:
- add_input(name, ndim=2, dtype='float'): Add an input with shape dimensionality ndim.
  - Arguments:
    - ndim: Use ndim=2 for vector input (samples, features), ndim=3 for temporal input (samples, time, features), ndim=4 for image input (samples, channels, height, width).
    - dtype: float or int. Use int if the input is connected to an Embedding layer, float otherwise.
- add_output(name, input=None, inputs=[], merge_mode='concat'): Add an output connected to input or inputs.
  - Arguments:
    - name: str. unique identifier of the output.
    - input: str name of the node that the output is connected to. Only specify one of either input or inputs.
    - inputs: list of str names of the nodes that the output is connected to.
    - merge_mode: "sum" or "concat". Only applicable if inputs list is specified. Merge mode for the different inputs.
- add_node(layer, name, input=None, inputs=[], merge_mode='concat'): Add a node connected to input or inputs.
  - Arguments:
    - layer: Layer instance.
    - name: str. unique identifier of the node.
    - input: str name of the node/input that the node is connected to. Only specify one of either input or inputs.
    - inputs: list of str names of the nodes/inputs that the node is connected to.
    - merge_mode: "sum" or "concat". Only applicable if inputs list is specified. Merge mode for the different inputs.
- compile(optimizer, loss):
  - Arguments:
    - optimizer: str (name of optimizer) or optimizer object. See optimizers.
    - loss: dictionary mapping the name(s) of the output(s) to a loss function (str name of objective function, or objective function). See objectives.
- fit(data, batch_size=128, nb_epoch=100, verbose=1, validation_split=0., validation_data=None, shuffle=True, callbacks=[]): Train a model for a fixed number of epochs.
  - Return: a history dictionary with a record of training loss values at successive epochs, as well as validation loss values (if applicable).
  - Arguments:
    - data: dictionary mapping input names and output names to appropriate numpy arrays. All arrays should contain the same number of samples.
    - batch_size: int. Number of samples per gradient update.
    - nb_epoch: int. Number of epochs to train for.
    - verbose: 0 for no logging to stdout, 1 for progress bar logging, 2 for one log line per epoch.
    - callbacks: keras.callbacks.Callback list. List of callbacks to apply during training. See callbacks.
    - validation_split: float (0. < x < 1). Fraction of the data to use as held-out validation data.
    - validation_data: dictionary mapping input names and output names to appropriate numpy arrays, to be used as held-out validation data. Will override validation_split.
    - shuffle: boolean. Whether to shuffle the samples at each epoch.
- evaluate(data, batch_size=128, verbose=1): Show performance of the model over some validation data.
  - Return: The loss score over the data.
  - Arguments: Same meaning as fit method above. verbose is used as a binary flag (progress bar or nothing).
- predict(data, batch_size=128, verbose=1):
  - Return: A dictionary mapping output names to arrays of predictions over the data.
  - Arguments: Same meaning as fit method above. Only inputs need to be specified in data.
- train_on_batch(data): Single gradient update on one batch.
  - Return: loss over the data.
- test_on_batch(data): Single performance evaluation on one batch.
  - Return: loss over the data.
- save_weights(fname, overwrite=False): Store the weights of all layers to an HDF5 file. If overwrite==False and the file already exists, an exception will be thrown.
- load_weights(fname): Sets the weights of a model, based on weights stored by save_weights. You can only call load_weights on a save file from a model with an identical architecture. load_weights can be called either before or after the compile step.
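The two merge_mode values can be pictured with plain Python lists standing in for the outputs of two nodes on a batch of samples. This is only a sketch: the `merge` helper is hypothetical, and it assumes "concat" joins along the feature axis while "sum" adds element-wise, which requires identical shapes.

```python
def merge(node_outputs, merge_mode='concat'):
    """Combine per-sample feature vectors from several nodes.

    node_outputs: list of batches, each a list of rows (samples x features).
    """
    if merge_mode == 'sum':
        # Element-wise addition: every node must emit the same number of features.
        return [
            [sum(values) for values in zip(*rows)]
            for rows in zip(*node_outputs)
        ]
    if merge_mode == 'concat':
        # Join the feature vectors of each sample end to end.
        return [
            [value for row in rows for value in row]
            for rows in zip(*node_outputs)
        ]
    raise ValueError("merge_mode must be 'sum' or 'concat'")


dense2_out = [[1, 2], [3, 4]]      # 2 samples, 2 features
dense3_out = [[10, 20], [30, 40]]  # 2 samples, 2 features

summed = merge([dense2_out, dense3_out], merge_mode='sum')
concatenated = merge([dense2_out, dense3_out], merge_mode='concat')
```

With 'sum' the merged width stays at 2 features; with 'concat' it grows to 4, so whatever consumes the merged result has to be sized accordingly.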

Examples:

from keras.models import Graph
from keras.layers.core import Dense

# graph model with one input and two outputs
graph = Graph()
graph.add_input(name='input', ndim=2)
graph.add_node(Dense(32, 16), name='dense1', input='input')
graph.add_node(Dense(32, 4), name='dense2', input='input')
graph.add_node(Dense(16, 4), name='dense3', input='dense1')
graph.add_output(name='output1', input='dense2')
graph.add_output(name='output2', input='dense3')

graph.compile('rmsprop', {'output1':'mse', 'output2':'mse'})
history = graph.fit({'input':X_train, 'output1':y_train, 'output2':y2_train}, nb_epoch=10)

# graph model with two inputs and one output
graph = Graph()
graph.add_input(name='input1', ndim=2)
graph.add_input(name='input2', ndim=2)
graph.add_node(Dense(32, 16), name='dense1', input='input1')
graph.add_node(Dense(32, 4), name='dense2', input='input2')
graph.add_node(Dense(16, 4), name='dense3', input='dense1')
graph.add_output(name='output', inputs=['dense2', 'dense3'], merge_mode='sum')
graph.compile('rmsprop', {'output':'mse'})

history = graph.fit({'input1':X_train, 'input2':X2_train, 'output':y_train}, nb_epoch=10)
predictions = graph.predict({'input1':X_test, 'input2':X2_test}) # {'output':...}
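Because fit expects every array in the data dictionary to hold the same number of samples, it can help to check that before training. The helper below is a hypothetical convenience, not part of Keras; it works on any values that support len(), and nested lists stand in for the numpy arrays here.

```python
def check_sample_counts(data):
    """Return the shared sample count, or raise if the arrays disagree."""
    if not data:
        raise ValueError("data dictionary is empty")
    counts = {name: len(array) for name, array in data.items()}
    if len(set(counts.values())) > 1:
        raise ValueError("Mismatched sample counts: %r" % (counts,))
    return next(iter(counts.values()))


# Toy stand-ins for X_train, X2_train and y_train (5 samples each).
data = {
    'input1': [[0.0] * 32 for _ in range(5)],
    'input2': [[0.0] * 32 for _ in range(5)],
    'output': [[0.0] * 4 for _ in range(5)],
}
n_samples = check_sample_counts(data)
```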
