diff --git a/docs/sources/constraints.md b/docs/sources/constraints.md
index 3fdbc3e79..867bca51a 100644
--- a/docs/sources/constraints.md
+++ b/docs/sources/constraints.md
@@ -1,20 +1,18 @@
-#Constraints
-
 ## Usage of constraints
 
-Regularizers allow the use of penalties on particular sets of parameters during optimization.
+Functions from the `constraints` module allow setting constraints (eg. non-negativity) on network parameters during optimization.
 
-A constraint is initilized with the value of the constraint.
-For example `maxnorm(3)` will constrain the weight vector to each hidden unit to have a maximum norm of 3.
 The keyword arguments used for passing constraints to parameters in a layer will depend on the layer.
-For weights in the `Dense` layer it is simply `W_constraint`.
-For biases in the `Dense` layer it is simply `b_constraint`.
+
+In the `Dense` layer it is simply `W_constraint` for the main weights matrix, and `b_constraint` for the bias.
+
 
 ```python
+from keras.constraints import maxnorm
 model.add(Dense(64, 64, W_constraint = maxnorm(2)))
 ```
 
 ## Available constraints
 
-- __maxnorm__: maximum-norm constraint
-- __nonneg__: non-negative constraint
\ No newline at end of file
+- __maxnorm__(m=2): maximum-norm constraint
+- __nonneg__(): non-negativity constraint
\ No newline at end of file
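
To make the constraint arguments above concrete, here is a slightly fuller usage sketch. It assumes the 0.x-era `Sequential`, `Activation` and `Dropout` layers together with the `Dense(input_dim, output_dim)` signature used throughout these docs; layer sizes, activation names and constraint values are illustrative only.

```python
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Dropout
from keras.constraints import maxnorm, nonneg

model = Sequential()
# Cap the norm of each hidden unit's incoming weight vector at 2
# and keep the biases non-negative (layer sizes are arbitrary).
model.add(Dense(64, 64, W_constraint=maxnorm(2), b_constraint=nonneg()))
model.add(Activation('relu'))
model.add(Dropout(0.5))
# Same max-norm cap on the output layer's weights.
model.add(Dense(64, 10, W_constraint=maxnorm(2)))
model.add(Activation('softmax'))
```
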
diff --git a/docs/sources/layers/core.md b/docs/sources/layers/core.md
index 0f897e660..8e76b231d 100644
--- a/docs/sources/layers/core.md
+++ b/docs/sources/layers/core.md
@@ -70,7 +70,8 @@ Set the weights of the parameters of the layer.
 
 ## Dense
 ```python
-keras.layers.core.Dense(input_dim, output_dim, init='glorot_uniform', activation='linear', weights=None)
+keras.layers.core.Dense(input_dim, output_dim, init='glorot_uniform', activation='linear', weights=None, \
+W_regularizer=None, b_regularizer=None, W_constraint=None, b_constraint=None)
 ```
 
 Standard 1D fully-connect layer.
@@ -86,12 +87,17 @@ Standard 1D fully-connect layer.
 - __init__: name of initialization function for the weights of the layer (see: [initializations](../initializations.md)), or alternatively, Theano function to use for weights initialization. This parameter is only relevant if you don't pass a `weights` argument.
 - __activation__: name of activation function to use (see: [activations](../activations.md)), or alternatively, elementwise Theano function. If you don't specify anything, no activation is applied (ie. "linear" activation: a(x) = x).
 - __weights__: list of numpy arrays to set as initial weights. The list should have 1 element, of shape `(input_dim, output_dim)`.
+ - __W_regularizer__: instance of the [regularizers](../regularizers.md) module (eg. L1 or L2 regularization), applied to the main weights matrix.
+ - __b_regularizer__: instance of the [regularizers](../regularizers.md) module, applied to the bias.
+ - __W_constraint__: instance of the [constraints](../constraints.md) module (eg. maxnorm, nonneg), applied to the main weights matrix.
+ - __b_constraint__: instance of the [constraints](../constraints.md) module, applied to the bias.
 
 ---
 
 ## TimeDistributedDense
 ```python
-keras.layers.core.TimeDistributedDense(input_dim, output_dim, init='glorot_uniform', activation='linear', weights=None)
+keras.layers.core.TimeDistributedDense(input_dim, output_dim, init='glorot_uniform', activation='linear', weights=None, \
+W_regularizer=None, b_regularizer=None, W_constraint=None, b_constraint=None)
 ```
 
 Fully-connected layer distributed over the time dimension. Useful after a recurrent network set to `return_sequences=True`.
@@ -104,6 +110,10 @@ Fully-connected layer distributed over the time dimension. Useful after a recurr
 - __init__: name of initialization function for the weights of the layer (see: [initializations](../initializations.md)), or alternatively, Theano function to use for weights initialization. This parameter is only relevant if you don't pass a `weights` argument.
 - __activation__: name of activation function to use (see: [activations](../activations.md)), or alternatively, elementwise Theano function. If you don't specify anything, no activation is applied (ie. "linear" activation: a(x) = x).
 - __weights__: list of numpy arrays to set as initial weights. The list should have 1 element, of shape `(input_dim, output_dim)`.
+ - __W_regularizer__: instance of the [regularizers](../regularizers.md) module (eg. L1 or L2 regularization), applied to the main weights matrix.
+ - __b_regularizer__: instance of the [regularizers](../regularizers.md) module, applied to the bias.
+ - __W_constraint__: instance of the [constraints](../constraints.md) module (eg. maxnorm, nonneg), applied to the main weights matrix.
+ - __b_constraint__: instance of the [constraints](../constraints.md) module, applied to the bias.
 
 - __Example__:
 ```python
@@ -138,7 +148,7 @@ keras.layers.core.Dropout(p)
 ```
 
 Apply dropout to the input. Dropout consists in randomly setting a fraction `p` of input units to 0 at each update during training time, which helps prevent overfitting. Reference: [Dropout: A Simple Way to Prevent Neural Networks from Overfitting](http://www.cs.toronto.edu/~rsalakhu/papers/srivastava14a.pdf)
 
-- __Input shape__: This layer does not assume a specific input shape. As a result, it cannot be used as the first layer in a model.
+- __Input shape__: This layer does not assume a specific input shape.
 
 - __Output shape__: Same as input.
@@ -156,7 +166,7 @@ keras.layers.core.Reshape(*dims)
 ```
 
 Reshape the input to a new shape containing the same number of units.
 
-- __Input shape__: This layer does not assume a specific input shape. As a result, it cannot be used as the first layer in a model.
+- __Input shape__: This layer does not assume a specific input shape.
 
 - __Output shape__: `(nb_samples, *dims)`.
@@ -193,7 +203,7 @@ keras.layers.core.RepeatVector(n)
 ```
 
 Repeat the 1D input n times. Dimensions of input are assumed to be (nb_samples, dim). Output will have the shape (nb_samples, n, dim).
 
-- __Input shape__: This layer does not assume a specific input shape. As a result, it cannot be used as the first layer in a model.
+- __Input shape__: This layer does not assume a specific input shape, and it cannot be used as the first layer in a model.
 
 - __Output shape__: `(nb_samples, n, input_dims)`.
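
A hedged sketch of the new `W_regularizer`/`b_regularizer`/`W_constraint`/`b_constraint` arguments on `Dense` and `TimeDistributedDense` documented above. It assumes the 0.x-era `Sequential` model and a `GRU(input_dim, output_dim, return_sequences=...)` recurrent layer; sizes and penalty strengths are arbitrary.

```python
from keras.models import Sequential
from keras.layers.core import Dense, TimeDistributedDense
from keras.layers.recurrent import GRU
from keras.regularizers import l2
from keras.constraints import maxnorm

# Dense layer combining a weight penalty with a weight constraint.
mlp = Sequential()
mlp.add(Dense(20, 64, init='glorot_uniform', activation='tanh',
              W_regularizer=l2(0.01), W_constraint=maxnorm(2)))

# TimeDistributedDense applied to every timestep returned by a
# recurrent layer with return_sequences=True (GRU signature assumed).
seq = Sequential()
seq.add(GRU(16, 32, return_sequences=True))
seq.add(TimeDistributedDense(32, 10, activation='softmax', b_regularizer=l2(0.01)))
```
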
diff --git a/docs/sources/regularizers.md b/docs/sources/regularizers.md
index f900f0cbb..c52d713d6 100644
--- a/docs/sources/regularizers.md
+++ b/docs/sources/regularizers.md
@@ -1,19 +1,17 @@
-# Regularizers
-
 ## Usage of regularizers
 
-Regularizers allow the use of penalties on particular sets of parameters during optimization.
+Regularizers allow applying penalties on network parameters during optimization.
 
-A penalty is initilized with its weight during optimization: `l1(.05)`.
 The keyword arguments used for passing penalties to parameters in a layer will depend on the layer.
-For weights in the `Dense` layer it is simply `W_regularizer`.
-For biases in the `Dense` layer it is simply `b_regularizer`.
+
+In the `Dense` layer it is simply `W_regularizer` for the main weights matrix, and `b_regularizer` for the bias.
 
 ```python
+from keras.regularizers import l2
 model.add(Dense(64, 64, W_regularizer = l2(.01)))
 ```
 
 ## Available penalties
 
-- __l1__: L1 regularization penalty, also known as LASSO
-- __l2__: L2 regularization penalty, also known as weight decay, or Ridge
+- __l1__(l=0.01): L1 regularization penalty, also known as LASSO
+- __l2__(l=0.01): L2 regularization penalty, also known as weight decay, or Ridge
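
Finally, a short sketch of the regularizer keywords documented above, again assuming the 0.x-style `Dense(input_dim, output_dim)` signature; the penalty strengths are arbitrary.

```python
from keras.models import Sequential
from keras.layers.core import Dense
from keras.regularizers import l1, l2

model = Sequential()
# L1 (sparsity-inducing) penalty on the weight matrix, L2 (weight decay) on the bias.
model.add(Dense(64, 64, W_regularizer=l1(0.05), b_regularizer=l2(0.01)))
```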