keras/examples/mnist_siamese_graph.py

'''Train a Siamese MLP on pairs of digits from the MNIST dataset.

It follows Hadsell-et-al.'06 [1] by computing the Euclidean distance on the
output of the shared network and by optimizing the contrastive loss (see paper
for more details).

[1] "Dimensionality Reduction by Learning an Invariant Mapping"
    http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf

Gets to 99.5% test accuracy after 20 epochs.
3 seconds per epoch on a Titan X GPU
'''
from __future__ import absolute_import
from __future__ import print_function
import numpy as np

import random
from keras.datasets import mnist
from keras.models import Sequential, Model
from keras.layers import Dense, Dropout, Input, Lambda
from keras.optimizers import RMSprop
from keras import backend as K


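def euclidean_distance(vects):
    x, y = vects
    sum_square = K.sum(K.square(x - y), axis=1, keepdims=True)
    # clamp before the square root: the gradient of sqrt is unbounded at 0,
    # which can yield NaNs when both branches output identical embeddings
    return K.sqrt(K.maximum(sum_square, K.epsilon()))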


def eucl_dist_output_shape(shapes):
    shape1, shape2 = shapes
    return (shape1[0], 1)


def contrastive_loss(y_true, y_pred):
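    '''Contrastive loss from Hadsell-et-al.'06
    http://yann.lecun.com/exdb/publis/pdf/hadsell-chopra-lecun-06.pdf

    In this example y_true is 1 for a similar pair and 0 for a dissimilar
    pair, and y_pred is the distance between the two embeddings.
    '''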
    margin = 1
    return K.mean(y_true * K.square(y_pred) +
                  (1 - y_true) * K.square(K.maximum(margin - y_pred, 0)))
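

# Illustrative sketch, not part of the original example: a plain-NumPy version
# of the loss above, handy for checking values by hand. It assumes the same
# convention as `contrastive_loss` (y_true is 1 for a similar pair, 0 for a
# dissimilar one). The name `contrastive_loss_np` and its `margin` argument
# are hypothetical and are not used anywhere else in this script.
import numpy as np  # repeated here so the sketch stands alone


def contrastive_loss_np(y_true, y_pred, margin=1.0):
    y_true = np.asarray(y_true, dtype='float64')
    y_pred = np.asarray(y_pred, dtype='float64')
    # similar pairs are penalised by their squared distance
    similar = y_true * np.square(y_pred)
    # dissimilar pairs are penalised only while they are closer than the margin
    dissimilar = (1 - y_true) * np.square(np.maximum(margin - y_pred, 0))
    return float(np.mean(similar + dissimilar))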


def create_pairs(x, digit_indices):
    '''Positive and negative pair creation.
    Alternates between positive pairs (same digit, label 1) and
    negative pairs (different digits, label 0).
    '''
    pairs = []
    labels = []
    n = min([len(digit_indices[d]) for d in range(10)]) - 1
    for d in range(10):
        for i in range(n):
            z1, z2 = digit_indices[d][i], digit_indices[d][i + 1]
            pairs += [[x[z1], x[z2]]]
            inc = random.randrange(1, 10)
            dn = (d + inc) % 10
            z1, z2 = digit_indices[d][i], digit_indices[dn][i]
            pairs += [[x[z1], x[z2]]]
            labels += [1, 0]
    return np.array(pairs), np.array(labels)


def create_base_network(input_dim):
    '''Base network to be shared (equivalent to feature extraction).
    '''
    seq = Sequential()
    seq.add(Dense(128, input_shape=(input_dim,), activation='relu'))
    seq.add(Dropout(0.1))
    seq.add(Dense(128, activation='relu'))
    seq.add(Dropout(0.1))
    seq.add(Dense(128, activation='relu'))
    return seq


def compute_accuracy(predictions, labels):
    '''Compute classification accuracy with a fixed threshold on distances.
    '''
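    # a pair is predicted "similar" (label 1) when its distance is below 0.5;
    # accuracy is the fraction of pairs whose prediction matches the label
    pred = predictions.ravel() < 0.5
    return np.mean(pred == labels)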


# the data, shuffled and split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(60000, 784)
x_test = x_test.reshape(10000, 784)
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
input_dim = 784
epochs = 20

# create training+test positive and negative pairs
digit_indices = [np.where(y_train == i)[0] for i in range(10)]
tr_pairs, tr_y = create_pairs(x_train, digit_indices)

digit_indices = [np.where(y_test == i)[0] for i in range(10)]
te_pairs, te_y = create_pairs(x_test, digit_indices)

# network definition
base_network = create_base_network(input_dim)

input_a = Input(shape=(input_dim,))
input_b = Input(shape=(input_dim,))

# because we re-use the same instance `base_network`,
# the weights of the network will be shared across the two branches
processed_a = base_network(input_a)
processed_b = base_network(input_b)

distance = Lambda(euclidean_distance,
                  output_shape=eucl_dist_output_shape)([processed_a, processed_b])

model = Model([input_a, input_b], distance)

# train
rms = RMSprop()
model.compile(loss=contrastive_loss, optimizer=rms)
model.fit([tr_pairs[:, 0], tr_pairs[:, 1]], tr_y,
          batch_size=128,
          epochs=epochs,
          validation_data=([te_pairs[:, 0], te_pairs[:, 1]], te_y))

# compute final accuracy on training and test sets
pred = model.predict([tr_pairs[:, 0], tr_pairs[:, 1]])
tr_acc = compute_accuracy(pred, tr_y)
pred = model.predict([te_pairs[:, 0], te_pairs[:, 1]])
te_acc = compute_accuracy(pred, te_y)

print('* Accuracy on training set: %0.2f%%' % (100 * tr_acc))
print('* Accuracy on test set: %0.2f%%' % (100 * te_acc))