keras/docs/sources/models.md
2015-07-05 21:05:56 -07:00

Sequential

Linear stack of layers.

model = keras.models.Sequential()
  • Methods:
    • add(layer): Add a layer to the model.
    • compile(optimizer, loss, class_mode="categorical", theano_mode=None):
      • Arguments:
        • optimizer: str (name of optimizer) or optimizer object. See optimizers.
        • loss: str (name of objective function) or objective function. See objectives.
        • class_mode: one of "categorical", "binary". This is only used for computing classification accuracy or using the predict_classes method.
        • theano_mode: a theano.compile.mode.Mode instance specifying compilation options.
    • fit(X, y, batch_size=128, nb_epoch=100, verbose=1, validation_split=0., validation_data=None, shuffle=True, show_accuracy=False, callbacks=[], class_weight=None, sample_weight=None): Train a model for a fixed number of epochs.
      • Return: a history dictionary with a record of training loss values at successive epochs, as well as validation loss values (if applicable), accuracy (if applicable), etc.
      • Arguments:
        • X: data.
        • y: labels.
        • batch_size: int. Number of samples per gradient update.
        • nb_epoch: int. Number of epochs to train for.
        • verbose: 0 for no logging to stdout, 1 for progress bar logging, 2 for one log line per epoch.
        • callbacks: keras.callbacks.Callback list. List of callbacks to apply during training. See callbacks.
        • validation_split: float (0. < x < 1). Fraction of the data to use as held-out validation data.
        • validation_data: tuple (X, y) to be used as held-out validation data. Will override validation_split.
        • shuffle: boolean. Whether to shuffle the samples at each epoch.
        • show_accuracy: boolean. Whether to display class accuracy in the logs to stdout at each epoch.
        • class_weight: dictionary mapping classes to a weight value, used for scaling the loss function (during training only).
        • sample_weight: list or numpy array with 1:1 mapping to the training samples, used for scaling the loss function (during training only).
    • evaluate(X, y, batch_size=128, show_accuracy=False, verbose=1): Show performance of the model over some validation data.
      • Return: The loss score over the data, or a (loss, accuracy) tuple if show_accuracy=True.
      • Arguments: Same meaning as fit method above. verbose is used as a binary flag (progress bar or nothing).
    • predict(X, batch_size=128, verbose=1):
      • Return: An array of predictions for some test data.
      • Arguments: Same meaning as fit method above.
    • predict_classes(X, batch_size=128, verbose=1): Return an array of class predictions for some test data.
      • Return: An array of labels for some test data.
      • Arguments: Same meaning as fit method above. verbose is used as a binary flag (progress bar or nothing).
    • train_on_batch(X, y, accuracy=False): Single gradient update on one batch.
      • Return: loss over the data, or tuple (loss, accuracy) if accuracy=True.
    • test_on_batch(X, y, accuracy=False): Single performance evaluation on one batch.
      • Return: loss over the data, or tuple (loss, accuracy) if accuracy=True.
    • save_weights(fname, overwrite=False): Store the weights of all layers in an HDF5 file. If overwrite==False and the file already exists, an exception is raised.
    • load_weights(fname): Set the weights of a model from a file written by save_weights. The file must come from a model with an identical architecture. load_weights can be called either before or after the compile step.
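
The class_mode argument only changes how raw outputs are turned into labels for accuracy and predict_classes. The following is a minimal pure-Python sketch of that idea, not the library's implementation. It assumes that "categorical" takes the arg-max over each output row and that "binary" thresholds a single output at 0.5; the helper names to_classes and accuracy are hypothetical.

```python
def to_classes(outputs, class_mode="categorical"):
    """Turn per-sample model outputs into integer class labels."""
    if class_mode == "categorical":
        # Index of the largest score in each row.
        return [max(range(len(row)), key=row.__getitem__) for row in outputs]
    if class_mode == "binary":
        # One output unit per sample, thresholded at 0.5 (assumed cut-off).
        return [1 if row[0] > 0.5 else 0 for row in outputs]
    raise ValueError("class_mode must be 'categorical' or 'binary'")


def accuracy(outputs, labels, class_mode="categorical"):
    """Fraction of samples whose predicted class matches the label."""
    predicted = to_classes(outputs, class_mode)
    hits = sum(1 for p, t in zip(predicted, labels) if p == t)
    return hits / float(len(labels))


categorical_out = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]
binary_out = [[0.2], [0.6], [0.9]]

print(to_classes(categorical_out))                  # [1, 0, 1]
print(to_classes(binary_out, class_mode="binary"))  # [0, 1, 1]
print(accuracy(categorical_out, [1, 0, 0]))         # 2 of 3 samples match
```

With a softmax output layer the "categorical" branch applies; a single sigmoid unit corresponds to the "binary" branch.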

Examples:

from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation
from keras.optimizers import SGD

model = Sequential()
model.add(Dense(64, 2, init='uniform'))
model.add(Activation('softmax'))

model.compile(loss='mse', optimizer='sgd')

# Demonstration of verbose modes 1 and 2
model.fit(X_train, y_train, nb_epoch=3, batch_size=16, verbose=1)
# outputs
'''
Train on 37800 samples, validate on 4200 samples
Epoch 0
37800/37800 [==============================] - 7s - loss: 0.0385
Epoch 1
37800/37800 [==============================] - 8s - loss: 0.0140
Epoch 2
10960/37800 [=======>......................] - ETA: 4s - loss: 0.0109
'''

model.fit(X_train, y_train, nb_epoch=3, batch_size=16, verbose=2)
# outputs
'''
Train on 37800 samples, validate on 4200 samples
Epoch 0
loss: 0.0190
Epoch 1
loss: 0.0146
Epoch 2
loss: 0.0049
'''

# Demonstration of show_accuracy
model.fit(X_train, y_train, nb_epoch=3, batch_size=16, verbose=2, show_accuracy=True)
# outputs
'''
Train on 37800 samples, validate on 4200 samples
Epoch 0
loss: 0.0190 - acc.: 0.8750
Epoch 1
loss: 0.0146 - acc.: 0.8750
Epoch 2
loss: 0.0049 - acc.: 1.0000
'''

# Demonstration of validation_split
model.fit(X_train, y_train, nb_epoch=3, batch_size=16, validation_split=0.1, show_accuracy=True, verbose=1)
# outputs
'''
Train on 37800 samples, validate on 4200 samples
Epoch 0
37800/37800 [==============================] - 7s - loss: 0.0385 - acc.: 0.7258 - val. loss: 0.0160 - val. acc.: 0.9136
Epoch 1
37800/37800 [==============================] - 8s - loss: 0.0140 - acc.: 0.9265 - val. loss: 0.0109 - val. acc.: 0.9383
Epoch 2
10960/37800 [=======>......................] - ETA: 4s - loss: 0.0109 - acc.: 0.9420
'''
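
The sample counts in the logs above follow from the validation_split arithmetic: holding out 10% of 42000 samples leaves 37800 for training and 4200 for validation. The sketch below illustrates this in plain Python. It assumes the held-out samples are the trailing fraction of the data, and split_validation is a hypothetical helper rather than a Keras function.

```python
def split_validation(samples, validation_split):
    """Hold out the trailing fraction of `samples` for validation."""
    if not 0.0 < validation_split < 1.0:
        raise ValueError("validation_split must satisfy 0 < x < 1")
    # Index of the first held-out sample.
    split_at = int(len(samples) * (1.0 - validation_split))
    return samples[:split_at], samples[split_at:]


data = list(range(42000))  # stand-in for 42000 training samples
train, val = split_validation(data, 0.1)
print("Train on %d samples, validate on %d samples" % (len(train), len(val)))
# Train on 37800 samples, validate on 4200 samples
```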

Graph

Arbitrary connection graph. It can have any number of inputs and outputs, with each output trained with its own loss function. The quantity being optimized by a Graph model is the sum of all loss functions over the different outputs.
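
As a rough illustration of the optimized quantity, the sketch below adds up one loss value per named output. It is written in plain Python under stated assumptions: mse and mae are simplified stand-ins for objective functions, and total_loss is a hypothetical helper, not part of the Graph API.

```python
def mse(y_true, y_pred):
    """Mean squared error over a flat list of values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / float(len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error over a flat list of values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / float(len(y_true))


def total_loss(losses, targets, predictions):
    """Sum the loss of every named output into one scalar."""
    return sum(losses[name](targets[name], predictions[name]) for name in losses)


losses = {'output1': mse, 'output2': mae}
targets = {'output1': [1.0, 0.0], 'output2': [2.0, 4.0]}
predictions = {'output1': [0.5, 0.5], 'output2': [1.0, 4.0]}

print(total_loss(losses, targets, predictions))  # 0.25 (mse) + 0.5 (mae) = 0.75
```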

model = keras.models.Graph()
  • Methods:
    • add_input(name, ndim=2, dtype='float'): Add an input with shape dimensionality ndim.
      • Arguments:
        • ndim: Use ndim=2 for vector input (samples, features), ndim=3 for temporal input (samples, time, features), ndim=4 for image input (samples, channels, height, width).
        • dtype: float or int. Use int if the input is connected to an Embedding layer, float otherwise.
    • add_output(name, input=None, inputs=[], merge_mode='concat'): Add an output connected to input or inputs.
      • Arguments:
        • name: str. unique identifier of the output.
        • input: str name of the node that the output is connected to. Only specify one of either input or inputs.
        • inputs: list of str names of the nodes that the output is connected to.
        • merge_mode: "sum" or "concat". Only applicable if inputs list is specified. Merge mode for the different inputs.
    • add_node(layer, name, input=None, inputs=[], merge_mode='concat'): Add a node connected to input or inputs.
      • Arguments:
        • layer: Layer instance.
        • name: str. unique identifier of the node.
        • input: str name of the node/input that the node is connected to. Only specify one of either input or inputs.
        • inputs: list of str names of the nodes/inputs that the node is connected to.
        • merge_mode: "sum" or "concat". Only applicable if inputs list is specified. Merge mode for the different inputs.
    • compile(optimizer, loss):
      • Arguments:
        • optimizer: str (name of optimizer) or optimizer object. See optimizers.
        • loss: dictionary mapping the name(s) of the output(s) to a loss function (string name of an objective function, or an objective function). See objectives.
    • fit(data, batch_size=128, nb_epoch=100, verbose=1, validation_split=0., validation_data=None, shuffle=True, callbacks=[]): Train a model for a fixed number of epochs.
      • Return: a history dictionary with a record of training loss values at successive epochs, as well as validation loss values (if applicable).
      • Arguments:
        • data: dictionary mapping input names and output names to appropriate numpy arrays. All arrays should contain the same number of samples.
        • batch_size: int. Number of samples per gradient update.
        • nb_epoch: int. Number of epochs to train for.
        • verbose: 0 for no logging to stdout, 1 for progress bar logging, 2 for one log line per epoch.
        • callbacks: keras.callbacks.Callback list. List of callbacks to apply during training. See callbacks.
        • validation_split: float (0. < x < 1). Fraction of the data to use as held-out validation data.
        • validation_data: tuple (X, y) to be used as held-out validation data. Will override validation_split.
        • shuffle: boolean. Whether to shuffle the samples at each epoch.
    • evaluate(data, batch_size=128, verbose=1): Show performance of the model over some validation data.
      • Return: The loss score over the data.
      • Arguments: Same meaning as fit method above. verbose is used as a binary flag (progress bar or nothing).
    • predict(data, batch_size=128, verbose=1):
      • Return: A dictionary mapping output names to arrays of predictions over the data.
      • Arguments: Same meaning as fit method above. Only inputs need to be specified in data.
    • train_on_batch(data): Single gradient update on one batch.
      • Return: loss over the data.
    • test_on_batch(data): Single performance evaluation on one batch.
      • Return: loss over the data.
    • save_weights(fname, overwrite=False): Store the weights of all layers in an HDF5 file. If overwrite==False and the file already exists, an exception is raised.
    • load_weights(fname): Set the weights of a model from a file written by save_weights. The file must come from a model with an identical architecture. load_weights can be called either before or after the compile step.
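
The data dictionary passed to fit, evaluate and predict is keyed by input and output names, and every array in it must have the same number of samples. The sketch below shows one way to check and split such a dictionary in plain Python. The helper split_graph_data is hypothetical, and nested lists stand in for numpy arrays.

```python
def split_graph_data(data, input_names, output_names):
    """Split a Graph-style data dictionary into inputs and targets."""
    names = list(input_names) + list(output_names)
    missing = [n for n in names if n not in data]
    if missing:
        raise KeyError("missing entries: %s" % ", ".join(missing))
    # Every array must describe the same number of samples.
    if len(set(len(data[n]) for n in names)) > 1:
        raise ValueError("all arrays must contain the same number of samples")
    inputs = dict((n, data[n]) for n in input_names)
    targets = dict((n, data[n]) for n in output_names)
    return inputs, targets


data = {
    'input': [[0.0, 1.0], [1.0, 0.0]],
    'output1': [[1.0], [0.0]],
    'output2': [[0.0], [1.0]],
}
inputs, targets = split_graph_data(data, ['input'], ['output1', 'output2'])
print(sorted(inputs))   # ['input']
print(sorted(targets))  # ['output1', 'output2']

# For prediction only the inputs are needed, so the output list is empty.
pred_inputs, _ = split_graph_data({'input': [[0.5, 0.5]]}, ['input'], [])
```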

Examples:

from keras.models import Graph
from keras.layers.core import Dense

# graph model with one input and two outputs
graph = Graph()
graph.add_input(name='input', ndim=2)
graph.add_node(Dense(32, 16), name='dense1', input='input')
graph.add_node(Dense(32, 4), name='dense2', input='input')
graph.add_node(Dense(16, 4), name='dense3', input='dense1')
graph.add_output(name='output1', input='dense2')
graph.add_output(name='output2', input='dense3')

graph.compile('rmsprop', {'output1':'mse', 'output2':'mse'})
history = graph.fit({'input':X_train, 'output1':y_train, 'output2':y2_train}, nb_epoch=10)

# graph model with two inputs and one output
graph = Graph()
graph.add_input(name='input1', ndim=2)
graph.add_input(name='input2', ndim=2)
graph.add_node(Dense(32, 16), name='dense1', input='input1')
graph.add_node(Dense(32, 4), name='dense2', input='input2')
graph.add_node(Dense(16, 4), name='dense3', input='dense1')
graph.add_output(name='output', inputs=['dense2', 'dense3'], merge_mode='sum')
graph.compile('rmsprop', {'output':'mse'})

history = graph.fit({'input1':X_train, 'input2':X2_train, 'output':y_train}, nb_epoch=10)
predictions = graph.predict({'input1':X_test, 'input2':X2_test}) # {'output':...}
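
In the second example, 'dense2' and 'dense3' both produce 4 features per sample, which is what allows merge_mode='sum'. The sketch below contrasts the two merge modes on nested lists. It assumes that "sum" adds element-wise and that "concat" joins along the feature axis; merge is a hypothetical helper, not the library's code.

```python
def merge(tensors, merge_mode="concat"):
    """Merge (samples, features) outputs coming from several nodes."""
    if merge_mode == "sum":
        # Element-wise addition needs identical feature widths.
        if len(set(len(t[0]) for t in tensors)) != 1:
            raise ValueError("'sum' requires inputs with the same shape")
        return [[sum(vals) for vals in zip(*rows)] for rows in zip(*tensors)]
    if merge_mode == "concat":
        # Join the feature vectors of each sample end to end.
        return [[v for row in rows for v in row] for rows in zip(*tensors)]
    raise ValueError("merge_mode must be 'sum' or 'concat'")


dense2_out = [[1.0, 2.0, 3.0, 4.0]]      # one sample, 4 features
dense3_out = [[10.0, 20.0, 30.0, 40.0]]  # one sample, 4 features

print(merge([dense2_out, dense3_out], merge_mode="sum"))
# [[11.0, 22.0, 33.0, 44.0]]
print(merge([dense2_out, dense3_out], merge_mode="concat"))
# [[1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0, 40.0]]
```

Under these assumptions "sum" keeps the output at 4 features per sample, while "concat" would yield 8.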