Keras is a library for working with neural networks. It is a high-level layer on top of Theano (or TensorFlow).
Building a neural network:
%matplotlib inline
from keras.layers import Dense
from keras.models import Sequential
model = Sequential()
model.add(Dense(output_dim=2, input_dim=5, activation="sigmoid"))
model.add(Dense(output_dim=1, activation="sigmoid"))
We can print a description of the model:
model.summary()
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to
====================================================================================================
dense_13 (Dense)                 (None, 2)             12          dense_input_5[0][0]
____________________________________________________________________________________________________
dense_14 (Dense)                 (None, 1)             3           dense_13[0][0]
====================================================================================================
Total params: 15
____________________________________________________________________________________________________
We can also visualize it:
from IPython.display import SVG
from keras.utils.visualize_util import model_to_dot
SVG(model_to_dot(model).create(prog='dot', format='svg'))
Then the model has to be compiled:
model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])
We use a synthetic dataset as an example: points in the plane, labeled according to whether they fall inside or outside a circle of radius 2:
import numpy as np
def dataset(n_train, n_test):
    n = n_train + n_test
    # Random points in the square [-3, 3] x [-3, 3].
    points = np.random.uniform(-3, 3, [n, 2])
    # Features: the two coordinates plus the quadratic terms x^2, y^2 and x*y,
    # which make the circular boundary learnable by a small network.
    features = np.c_[points, points[:, 0]**2, points[:, 1]**2, points[:, 0] * points[:, 1]]
    # Label 1 if the point falls outside the circle of radius 2, 0 otherwise.
    labels = (np.linalg.norm(points, axis=1) > 2).astype(int)
    return (features[:n_train], labels[:n_train]), (features[n_train:], labels[n_train:])
(X_train, y_train), (X_test, y_test) = dataset(8000, 2000)
We train the network:
model.fit(X_train, y_train, nb_epoch=50, batch_size=32)
Epoch 1/50
8000/8000 [==============================] - 0s - loss: 0.7362 - acc: 0.4261
Epoch 2/50
8000/8000 [==============================] - 0s - loss: 0.6680 - acc: 0.5556
Epoch 3/50
8000/8000 [==============================] - 0s - loss: 0.6313 - acc: 0.6415
...
Epoch 49/50
8000/8000 [==============================] - 0s - loss: 0.1188 - acc: 0.9846
Epoch 50/50
8000/8000 [==============================] - 0s - loss: 0.1156 - acc: 0.9880
<keras.callbacks.History at 0x7f20ad4bf978>
We can evaluate the model and look at some metrics:
loss_and_metrics = model.evaluate(X_test, y_test, batch_size=32)
print()
print()
print('Loss:', loss_and_metrics[0])
print('Accuracy:', loss_and_metrics[1])
2000/2000 [==============================] - 0s
Loss: 0.111558535993
Accuracy: 0.983
We can also look at the predicted classes and their probabilities:
y_pred = model.predict_classes(X_test)
probability = model.predict_proba(X_test)
print()
print(y_pred)
print(probability)
1760/2000 [=========================>....] - ETA: 0s
[[1]
 [0]
 [1]
 ...,
 [0]
 [1]
 [1]]
[[ 0.90671945]
 [ 0.0555241 ]
 [ 0.98214972]
 ...,
 [ 0.11941245]
 [ 0.97730196]
 [ 0.90484691]]
We plot the predicted classes together with the true decision boundary (the circle of radius 2):
import matplotlib.pyplot as plt
def plot(inside, outside):
    if inside.any():
        plt.plot(inside[:, 0], inside[:, 1], 'bo')
    if outside.any():
        plt.plot(outside[:, 0], outside[:, 1], 'ro')
    # Draw the true decision boundary as a reference.
    circle = plt.Circle((0, 0), radius=2, color='g', fill=False)
    ax = plt.gca()
    ax.set_aspect(1)
    ax.add_patch(circle)
    plt.show()
points_test = X_test[:, :2]
inside = np.array([x for x, y in zip(points_test, y_pred) if y == 0])
outside = np.array([x for x, y in zip(points_test, y_pred) if y == 1])
plot(inside, outside)
The optimizer can also be instantiated "manually" in order to tune parameters such as the learning rate, the momentum, and the learning-rate decay:
from keras.optimizers import SGD
model.compile(loss='binary_crossentropy', optimizer=SGD(lr=0.01, momentum=0.9, decay=0.001), metrics=['accuracy'])
Keras provides an API for using its models as if they were scikit-learn estimators.
from keras.wrappers import scikit_learn
from sklearn import cross_validation
classifier = scikit_learn.KerasClassifier(build_fn=lambda: model, nb_epoch=5)
# build_fn must return a compiled model.
classifier.fit(X_train, y_train)
kfold = cross_validation.StratifiedKFold(y=y_train, n_folds=3, shuffle=True)
results = cross_validation.cross_val_score(classifier, X_train, y_train, cv=kfold)
Epoch 1/5
8000/8000 [==============================] - 0s - loss: 0.1041 - acc: 0.9882
...
Epoch 5/5
8000/8000 [==============================] - 0s - loss: 0.0709 - acc: 0.9946
Epoch 1/5
5333/5333 [==============================] - 0s - loss: 0.0679 - acc: 0.9944
...
Epoch 5/5
5334/5334 [==============================] - 0s - loss: 0.0547 - acc: 0.9972
1376/2666 [==============>...............] - ETA: 0s
print('Accuracy:', results.mean())
Accuracy: 0.996625140537
The KerasRegressor class is also available. With these wrappers one can do things like use GridSearchCV to search for the parameter combination that gives the best results (values of the learning rate, momentum, etc.), or build a pipeline in which the model takes its data from other scikit-learn models. A sketch of such a grid search follows.
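This is a minimal sketch only, assuming the same 5-feature dataset from above and the pre-0.18 scikit-learn module layout already used in this section (sklearn.grid_search); the build_model function and its lr parameter are illustrative names, not part of the original example:
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import SGD
from keras.wrappers import scikit_learn
from sklearn.grid_search import GridSearchCV

def build_model(lr=0.01):
    # KerasClassifier forwards matching keyword arguments (here: lr) to build_fn.
    m = Sequential()
    m.add(Dense(output_dim=2, input_dim=5, activation='sigmoid'))
    m.add(Dense(output_dim=1, activation='sigmoid'))
    m.compile(loss='binary_crossentropy', optimizer=SGD(lr=lr),
              metrics=['accuracy'])
    return m

classifier = scikit_learn.KerasClassifier(build_fn=build_model, nb_epoch=5)
# Try several learning rates with cross-validation and keep the best one.
grid = GridSearchCV(classifier, param_grid={'lr': [0.001, 0.01, 0.1]})
grid.fit(X_train, y_train)
print('Best parameters:', grid.best_params_)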
Next is an example on the IMDB dataset that also shows off more functionality.
We start by setting some parameters:
from keras.preprocessing import sequence
from keras.layers import Dropout, Activation
from keras.layers import Embedding
from keras.layers import LSTM, GRU, SimpleRNN
from keras.layers import Convolution1D, MaxPooling1D
from keras.datasets import imdb
# Embedding
max_features = 20000
maxlen = 100
embedding_size = 128
# Convolution
filter_length = 5
nb_filter = 64
pool_length = 4
# LSTM
lstm_output_size = 70
# Training
batch_size = 30
nb_epoch = 2
We load the data:
print('Loading data...')
(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features)
print(len(X_train), 'training examples')
print(len(X_test), 'test examples')
Loading data...
20000 training examples
5000 test examples
Let's look at two of the examples:
print(X_train[1])
print()
print(X_train[16])
[1, 621, 6, 135, 101, 84, 392, 27, 20, 133, 1522, 63, 6401, 6843, 896, 11, 213, 149, 9, 417, 180, 1748, 32, 63, 31, 525, 7, 78, 42, 147, 66, 644, 113, 89, 8, 21, 147, 89, 13, 1851, 3994, 43, 170, 6, 60, 21, 296, 35, 1310, 214, 6, 789, 7146, 7, 15, 16, 12, 14, 15, 16, 12, 14, 5, 927, 10, 5, 6843, 13, 62, 23, 652, 25, 2046, 927, 7, 73, 574, 49, 5, 6843, 1264, 56, 135, 46, 38, 6, 5, 191, 342, 10, 14274, 10, 6843, 16849, 2572, 13, 627, 7, 412, 18, 361, 6, 20, 11393, 342, 17222, 45, 241, 382, 5, 28, 7, 15, 16, 12, 14, 15, 16, 12, 14, 5, 132, 18, 5, 28, 24, 3773, 209, 6, 2380, 61, 6, 2082, 146, 10885, 6, 2962, 146, 1003, 6, 523, 146, 910, 6, 99, 7, 19, 165, 266, 53, 23, 460, 6, 29, 33, 199, 190, 11, 41, 286, 8436, 11, 186, 17, 7, 5, 78, 1522, 24, 89, 33, 4317, 17, 551, 1851, 3994, 43, 37, 240, 40, 635, 9, 189, 331, 4183, 45, 5, 2, 6, 102, 37, 24, 5, 137, 18, 5, 757, 7, 15, 16, 12, 14, 15, 16, 12, 14, 25, 26, 212, 63, 20, 30, 13, 36, 11, 41, 635, 636, 7, 53, 230, 35, 212, 43, 46, 199, 6843, 26, 2539, 61, 1401, 7, 5, 453, 4693, 231, 112, 40, 93, 4232, 27, 9, 5926, 2, 56, 127, 7, 78, 124, 20, 30, 18, 9, 3245, 617, 2806, 6, 911, 19, 66, 82, 64, 681, 4058, 11, 5, 1324, 7]

[1, 19, 101, 186, 17, 9, 243, 3850, 7, 295, 17, 23, 57, 102, 19, 158, 131, 129, 9, 9201, 328, 32, 3071, 4423, 43, 31, 6, 29, 19, 57, 107, 35, 94, 100, 17, 52, 9, 192, 10, 97, 95, 101, 7, 5, 1969, 13, 80, 103, 171, 25, 2175, 19621, 9, 4563, 6, 9, 154, 21, 313, 18, 72, 447, 193, 76, 19, 160, 5864, 43]
Each example is a sequence of word indices. We convert them to fixed-length vectors, padding with zeros:
X_train = sequence.pad_sequences(X_train, maxlen=maxlen)
X_test = sequence.pad_sequences(X_test, maxlen=maxlen)
print('X_train shape:', X_train.shape)
print('X_test shape:', X_test.shape)
X_train shape: (20000, 100)
X_test shape: (5000, 100)
The same two examples now:
print(X_train[1])
print()
print(X_train[16])
[ 24 89 33 4317 17 551 1851 3994 43 37 240 40 635 9 189 331 4183 45 5 2 6 102 37 24 5 137 18 5 757 7 15 16 12 14 15 16 12 14 25 26 212 63 20 30 13 36 11 41 635 636 7 53 230 35 212 43 46 199 6843 26 2539 61 1401 7 5 453 4693 231 112 40 93 4232 27 9 5926 2 56 127 7 78 124 20 30 18 9 3245 617 2806 6 911 19 66 82 64 681 4058 11 5 1324 7]

[ 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 19 101 186 17 9 243 3850 7 295 17 23 57 102 19 158 131 129 9 9201 328 32 3071 4423 43 31 6 29 19 57 107 35 94 100 17 52 9 192 10 97 95 101 7 5 1969 13 80 103 171 25 2175 19621 9 4563 6 9 154 21 313 18 72 447 193 76 19 160 5864 43]
Now we build a model:
model = Sequential()
model.add(Embedding(max_features, embedding_size, input_length=maxlen))
model.add(Dropout(0.25))
model.add(Convolution1D(nb_filter=nb_filter,
                        filter_length=filter_length,
                        border_mode='valid',
                        activation='relu',
                        subsample_length=1))
model.add(MaxPooling1D(pool_length=pool_length))
model.add(LSTM(lstm_output_size))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
Visualization:
SVG(model_to_dot(model).create(prog='dot', format='svg'))
We train and evaluate it:
print('Training...')
model.fit(X_train, y_train, batch_size=batch_size, nb_epoch=nb_epoch,
          validation_data=(X_test, y_test))
score, acc = model.evaluate(X_test, y_test, batch_size=batch_size)
print('Loss:', score)
print('Accuracy:', acc)
Training...
Train on 20000 samples, validate on 5000 samples
Epoch 1/2
20000/20000 [==============================] - 137s - loss: 0.4298 - acc: 0.7888 - val_loss: 0.3346 - val_acc: 0.8492
Epoch 2/2
20000/20000 [==============================] - 138s - loss: 0.2130 - acc: 0.9182 - val_loss: 0.3381 - val_acc: 0.8504
5000/5000 [==============================] - 9s
Loss: 0.338069551766
Accuracy: 0.850399994493
Unfortunately, this dataset contains only the word indices, so we cannot try the model on raw text.
Many layers are available. Notable ones are listed below; a short sketch using a few of them follows the list.
- Dense
- Activation
- Merge: combines several layers, e.g. by concatenating or adding them.
- Reshape
- Permute
- RepeatVector
- LSTM
- Embedding: represents natural numbers from a given range (indices) as fixed-dimension vectors. The weights trained in this layer are the entries of the matrix that maps indices to vectors (which acts as a look-up table).
- BatchNormalization: keeps activations with mean close to 0 and standard deviation close to 1.
- Dropout: randomly drops a fraction p of the inputs.
- GaussianNoise: adds Gaussian noise to the input.
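As a minimal sketch (under the same Keras 1.x API used throughout this notebook; the layer sizes are arbitrary and the model is not trained here), a few of these layers can be combined like this:
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers.normalization import BatchNormalization
from keras.layers.noise import GaussianNoise

noisy_model = Sequential()
# Perturb the 5 input features with Gaussian noise (only active during training).
noisy_model.add(GaussianNoise(0.1, input_shape=(5,)))
noisy_model.add(Dense(16, activation='relu'))
# Normalize activations to mean ~0 and standard deviation ~1.
noisy_model.add(BatchNormalization())
# Randomly drop 25% of the units during training.
noisy_model.add(Dropout(0.25))
noisy_model.add(Dense(1, activation='sigmoid'))
noisy_model.compile(loss='binary_crossentropy', optimizer='sgd',
                    metrics=['accuracy'])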
Image classification
from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D
from keras.optimizers import SGD
from keras.utils import np_utils
batch_size = 32
nb_classes = 10
nb_epoch = 1
data_augmentation = False
# input image dimensions
img_rows, img_cols = 32, 32
# the CIFAR10 images are RGB
img_channels = 3
(X_train, y_train), (X_test, y_test) = cifar10.load_data()
print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'training examples')
print(X_test.shape[0], 'test examples')
print()
print('X_train[0]:', X_train[0])
X_train shape: (50000, 3, 32, 32)
50000 training examples
10000 test examples

X_train[0]: [[[ 59  43  50 ..., 158 152 148]
  [ 16   0  18 ..., 123 119 122]
  [ 25  16  49 ..., 118 120 109]
  ...,
  [208 201 198 ..., 160  56  53]
  [180 173 186 ..., 184  97  83]
  [177 168 179 ..., 216 151 123]]

 [[ 62  46  48 ..., 132 125 124]
  [ 20   0   8 ...,  88  83  87]
  [ 24   7  27 ...,  84  84  73]
  ...,
  [170 153 161 ..., 133  31  34]
  [139 123 144 ..., 148  62  53]
  [144 129 142 ..., 184 118  92]]

 [[ 63  45  43 ..., 108 102 103]
  [ 20   0   0 ...,  55  50  57]
  [ 21   0   8 ...,  50  50  42]
  ...,
  [ 96  34  26 ...,  70   7  20]
  [ 96  42  30 ...,  94  34  34]
  [116  94  87 ..., 140  84  72]]]
print('y_train[0]:', y_train[0])
print('y_train[1]:', y_train[1])
y_train[0]: [6]
y_train[1]: [9]
We convert the labels to a one-hot encoding:
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)
print('Y_train[0]:', Y_train[0])
print('Y_train[1]:', Y_train[1])
Y_train[0]: [ 0.  0.  0.  0.  0.  0.  1.  0.  0.  0.]
Y_train[1]: [ 0.  0.  0.  0.  0.  0.  0.  0.  0.  1.]
We build a convolutional network:
model = Sequential()
model.add(Convolution2D(32, 3, 3, border_mode='same', input_shape=(img_channels, img_rows, img_cols)))
model.add(Activation('relu'))
model.add(Convolution2D(32, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Convolution2D(64, 3, 3, border_mode='same'))
model.add(Activation('relu'))
model.add(Convolution2D(64, 3, 3))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
SVG(model_to_dot(model).create(prog='dot', format='svg'))
We rescale the data to the interval [0, 1]:
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
if not data_augmentation:
    model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch,
              validation_data=(X_test, Y_test), shuffle=True)
else:
    datagen = ImageDataGenerator(
        featurewise_center=False,  # set input mean to 0 over the dataset
        samplewise_center=False,  # set each sample mean to 0
        featurewise_std_normalization=False,  # divide inputs by std of the dataset
        samplewise_std_normalization=False,  # divide each input by its std
        zca_whitening=False,  # apply ZCA whitening
        rotation_range=0,  # randomly rotate images in the range (degrees, 0 to 180)
        width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
        height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
        horizontal_flip=True,  # randomly flip images horizontally
        vertical_flip=False)  # do not flip images vertically

    # Compute quantities required for featurewise normalization
    # (std, mean, and principal components if ZCA whitening is applied).
    datagen.fit(X_train)
    model.fit_generator(datagen.flow(X_train, Y_train, batch_size=batch_size),
                        samples_per_epoch=X_train.shape[0], nb_epoch=nb_epoch,
                        validation_data=(X_test, Y_test))
Train on 50000 samples, validate on 10000 samples
Epoch 1/1
50000/50000 [==============================] - 898s - loss: 1.7448 - acc: 0.3547 - val_loss: 1.2809 - val_acc: 0.5377
For comparison, we flatten the images and train a simple fully connected network on them:
X_train = X_train.reshape(50000, 3072)
X_test = X_test.reshape(10000, 3072)
model2 = Sequential()
model2.add(Dense(100, input_dim=3072, activation='relu'))
model2.add(Dense(10, activation='relu'))
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model2.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
SVG(model_to_dot(model2).create(prog='dot', format='svg'))
model2.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch, validation_data=(X_test, Y_test), shuffle=True)
Train on 50000 samples, validate on 10000 samples
Epoch 1/1
50000/50000 [==============================] - 18s - loss: 7.6269 - acc: 0.1000 - val_loss: 7.5525 - val_acc: 0.1000
<keras.callbacks.History at 0x7f20a9abcd68>
Note that this network stays at 10% accuracy (chance level for 10 classes), most likely because its output layer uses a relu activation instead of softmax, so it does not produce a probability distribution over the classes as categorical_crossentropy expects.
Functional API
For more flexibility than Sequential offers when building a neural network, this part of the library can be used. It allows things like models with several inputs and outputs, shared layers, and in general arbitrary directed acyclic graphs of layers.
A simple example:
from keras.layers import Input, Embedding, LSTM, Dense, merge
from keras.models import Model
main_input = Input(shape=(100,), dtype='int32', name='main_input')
x = Embedding(output_dim=512, input_dim=10000, input_length=100)(main_input)
lstm_out = LSTM(32)(x)
auxiliary_loss = Dense(1, activation='sigmoid', name='aux_output')(lstm_out)
auxiliary_input = Input(shape=(5,), name='aux_input')
x = merge([lstm_out, auxiliary_input], mode='concat')
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
main_loss = Dense(1, activation='sigmoid', name='main_output')(x)
model = Model(input=[main_input, auxiliary_input], output=[main_loss, auxiliary_loss])
model.compile(optimizer='rmsprop', loss='binary_crossentropy',
              loss_weights=[1., 0.2])
SVG(model_to_dot(model).create(prog='dot', format='svg'))
Callbacks
Callbacks allow actions to be run after certain events, such as the end of a batch or an epoch. Typical use cases are logging metrics, periodically saving the model's weights (ModelCheckpoint), and stopping training early when a monitored metric stops improving (EarlyStopping).
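A minimal sketch, assuming an already compiled model and matching X_train/y_train; the LossLogger class and the weights.h5 file name are illustrative, not from the original:
from keras.callbacks import Callback, ModelCheckpoint, EarlyStopping

class LossLogger(Callback):
    # Record the training loss after every batch.
    def on_train_begin(self, logs={}):
        self.losses = []

    def on_batch_end(self, batch, logs={}):
        self.losses.append(logs.get('loss'))

callbacks = [
    LossLogger(),
    # Save the weights whenever the validation loss improves.
    ModelCheckpoint('weights.h5', monitor='val_loss', save_best_only=True),
    # Stop training if the validation loss does not improve for 2 epochs.
    EarlyStopping(monitor='val_loss', patience=2),
]
model.fit(X_train, y_train, nb_epoch=50, batch_size=32,
          validation_split=0.2, callbacks=callbacks)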
There are more features that were only touched on briefly here but are worth knowing about, such as the optimizers (keras.optimizers), the preprocessing utilities (keras.preprocessing), and the bundled datasets (keras.datasets) that appeared in passing above.