diff --git a/docs/sources/constraints.md b/docs/sources/constraints.md
index 3fdbc3e79..867bca51a 100644
--- a/docs/sources/constraints.md
+++ b/docs/sources/constraints.md
@@ -1,20 +1,18 @@
-#Constraints
-
 ## Usage of constraints
 
-Regularizers allow the use of penalties on particular sets of parameters during optimization.
+Functions from the `constraints` module allow setting constraints (eg. non-negativity) on network parameters during optimization.
 
-A constraint is initilized with the value of the constraint.
-For example `maxnorm(3)` will constrain the weight vector to each hidden unit to have a maximum norm of 3.
 The keyword arguments used for passing constraints to parameters in a layer will depend on the layer.
-For weights in the `Dense` layer it is simply `W_constraint`.
-For biases in the `Dense` layer it is simply `b_constraint`.
+
+In the `Dense` layer it is simply `W_constraint` for the main weights matrix, and `b_constraint` for the bias.
+
 
 ```python
+from keras.constraints import maxnorm
 model.add(Dense(64, 64, W_constraint = maxnorm(2)))
 ```
 
 ## Available constraints
 
-- __maxnorm__: maximum-norm constraint
-- __nonneg__: non-negative constraint
\ No newline at end of file
+- __maxnorm__(m=2): maximum-norm constraint
+- __nonneg__(): non-negativity constraint
\ No newline at end of file
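
To make the constraint arguments above concrete, here is a slightly fuller usage sketch. It assumes the 0.x-era `Sequential`, `Activation` and `Dropout` layers together with the `Dense(input_dim, output_dim)` signature used throughout these docs; layer sizes, activation names and constraint values are illustrative only.

```python
from keras.models import Sequential
from keras.layers.core import Dense, Activation, Dropout
from keras.constraints import maxnorm, nonneg

model = Sequential()
# Cap the norm of each hidden unit's incoming weight vector at 2
# and keep the biases non-negative (layer sizes are arbitrary).
model.add(Dense(64, 64, W_constraint=maxnorm(2), b_constraint=nonneg()))
model.add(Activation('relu'))
model.add(Dropout(0.5))
# Same max-norm cap on the output layer's weights.
model.add(Dense(64, 10, W_constraint=maxnorm(2)))
model.add(Activation('softmax'))
```
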
diff --git a/docs/sources/layers/core.md b/docs/sources/layers/core.md
index 0f897e660..8e76b231d 100644
--- a/docs/sources/layers/core.md
+++ b/docs/sources/layers/core.md
@@ -70,7 +70,8 @@ Set the weights of the parameters of the layer.
 
 ## Dense
 ```python
-keras.layers.core.Dense(input_dim, output_dim, init='glorot_uniform', activation='linear', weights=None)
+keras.layers.core.Dense(input_dim, output_dim, init='glorot_uniform', activation='linear', weights=None, \
+W_regularizer=None, b_regularizer=None, W_constraint=None, b_constraint=None)
 ```
 
 Standard 1D fully-connect layer.
@@ -86,12 +87,17 @@ Standard 1D fully-connect layer.
 - __init__: name of initialization function for the weights of the layer (see: [initializations](../initializations.md)), or alternatively, Theano function to use for weights initialization. This parameter is only relevant if you don't pass a `weights` argument.
 - __activation__: name of activation function to use (see: [activations](../activations.md)), or alternatively, elementwise Theano function. If you don't specify anything, no activation is applied (ie. "linear" activation: a(x) = x).
 - __weights__: list of numpy arrays to set as initial weights. The list should have 1 element, of shape `(input_dim, output_dim)`.
+ - __W_regularizer__: instance of the [regularizers](../regularizers.md) module (eg. L1 or L2 regularization), applied to the main weights matrix.
+ - __b_regularizer__: instance of the [regularizers](../regularizers.md) module, applied to the bias.
+ - __W_constraint__: instance of the [constraints](../constraints.md) module (eg. maxnorm, nonneg), applied to the main weights matrix.
+ - __b_constraint__: instance of the [constraints](../constraints.md) module, applied to the bias.
 
 ---
 
 ## TimeDistributedDense
 ```python
-keras.layers.core.TimeDistributedDense(input_dim, output_dim, init='glorot_uniform', activation='linear', weights=None)
+keras.layers.core.TimeDistributedDense(input_dim, output_dim, init='glorot_uniform', activation='linear', weights=None, \
+W_regularizer=None, b_regularizer=None, W_constraint=None, b_constraint=None)
 ```
 
 Fully-connected layer distributed over the time dimension. Useful after a recurrent network set to `return_sequences=True`.
@@ -104,6 +110,10 @@ Fully-connected layer distributed over the time dimension. Useful after a recurr
 - __init__: name of initialization function for the weights of the layer (see: [initializations](../initializations.md)), or alternatively, Theano function to use for weights initialization. This parameter is only relevant if you don't pass a `weights` argument.
 - __activation__: name of activation function to use (see: [activations](../activations.md)), or alternatively, elementwise Theano function. If you don't specify anything, no activation is applied (ie. "linear" activation: a(x) = x).
 - __weights__: list of numpy arrays to set as initial weights. The list should have 1 element, of shape `(input_dim, output_dim)`.
+ - __W_regularizer__: instance of the [regularizers](../regularizers.md) module (eg. L1 or L2 regularization), applied to the main weights matrix.
+ - __b_regularizer__: instance of the [regularizers](../regularizers.md) module, applied to the bias.
+ - __W_constraint__: instance of the [constraints](../constraints.md) module (eg. maxnorm, nonneg), applied to the main weights matrix.
+ - __b_constraint__: instance of the [constraints](../constraints.md) module, applied to the bias.
 
 - __Example__:
 ```python
@@ -138,7 +148,7 @@ keras.layers.core.Dropout(p)
 ```
 
 Apply dropout to the input. Dropout consists in randomly setting a fraction `p` of input units to 0 at each update during training time, which helps prevent overfitting. Reference: [Dropout: A Simple Way to Prevent Neural Networks from Overfitting](http://www.cs.toronto.edu/~rsalakhu/papers/srivastava14a.pdf)
 
-- __Input shape__: This layer does not assume a specific input shape. As a result, it cannot be used as the first layer in a model.
+- __Input shape__: This layer does not assume a specific input shape.
 
 - __Output shape__: Same as input.
@@ -156,7 +166,7 @@ keras.layers.core.Reshape(*dims)
 ```
 
 Reshape the input to a new shape containing the same number of units.
 
-- __Input shape__: This layer does not assume a specific input shape. As a result, it cannot be used as the first layer in a model.
+- __Input shape__: This layer does not assume a specific input shape.
 
 - __Output shape__: `(nb_samples, *dims)`.
@@ -193,7 +203,7 @@ keras.layers.core.RepeatVector(n)
 ```
 
 Repeat the 1D input n times. Dimensions of input are assumed to be (nb_samples, dim). Output will have the shape (nb_samples, n, dim).
 
-- __Input shape__: This layer does not assume a specific input shape. As a result, it cannot be used as the first layer in a model.
+- __Input shape__: This layer does not assume a specific input shape, and it cannot be used as the first layer in a model.
 
 - __Output shape__: `(nb_samples, n, input_dims)`.
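
A hedged sketch of the new `W_regularizer`/`b_regularizer`/`W_constraint`/`b_constraint` arguments on `Dense` and `TimeDistributedDense` documented above. It assumes the 0.x-era `Sequential` model and a `GRU(input_dim, output_dim, return_sequences=...)` recurrent layer; sizes and penalty strengths are arbitrary.

```python
from keras.models import Sequential
from keras.layers.core import Dense, TimeDistributedDense
from keras.layers.recurrent import GRU
from keras.regularizers import l2
from keras.constraints import maxnorm

# Dense layer combining a weight penalty with a weight constraint.
mlp = Sequential()
mlp.add(Dense(20, 64, init='glorot_uniform', activation='tanh',
              W_regularizer=l2(0.01), W_constraint=maxnorm(2)))

# TimeDistributedDense applied to every timestep returned by a
# recurrent layer with return_sequences=True (GRU signature assumed).
seq = Sequential()
seq.add(GRU(16, 32, return_sequences=True))
seq.add(TimeDistributedDense(32, 10, activation='softmax', b_regularizer=l2(0.01)))
```
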
diff --git a/docs/sources/regularizers.md b/docs/sources/regularizers.md
index f900f0cbb..c52d713d6 100644
--- a/docs/sources/regularizers.md
+++ b/docs/sources/regularizers.md
@@ -1,19 +1,17 @@
-# Regularizers
-
 ## Usage of regularizers
 
-Regularizers allow the use of penalties on particular sets of parameters during optimization.
+Regularizers allow applying penalties on network parameters during optimization.
 
-A penalty is initilized with its weight during optimization: `l1(.05)`.
 The keyword arguments used for passing penalties to parameters in a layer will depend on the layer.
-For weights in the `Dense` layer it is simply `W_regularizer`.
-For biases in the `Dense` layer it is simply `b_regularizer`.
+
+In the `Dense` layer it is simply `W_regularizer` for the main weights matrix, and `b_regularizer` for the bias.
 
 ```python
+from keras.regularizers import l2
 model.add(Dense(64, 64, W_regularizer = l2(.01)))
 ```
 
 ## Available penalties
 
-- __l1__: L1 regularization penalty, also known as LASSO
-- __l2__: L2 regularization penalty, also known as weight decay, or Ridge
+- __l1__(l=0.01): L1 regularization penalty, also known as LASSO
+- __l2__(l=0.01): L2 regularization penalty, also known as weight decay, or Ridge
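
Finally, a short sketch of the regularizer keywords documented above, again assuming the 0.x-style `Dense(input_dim, output_dim)` signature; the penalty strengths are arbitrary.

```python
from keras.models import Sequential
from keras.layers.core import Dense
from keras.regularizers import l1, l2

model = Sequential()
# L1 (sparsity-inducing) penalty on the weight matrix, L2 (weight decay) on the bias.
model.add(Dense(64, 64, W_regularizer=l1(0.05), b_regularizer=l2(0.01)))
```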