# Set a random seed for Keras and make TensorFlow operations deterministic so that every run produces the same results.
import tensorflow as tf
tf.keras.utils.set_random_seed(42)
tf.config.experimental.enable_op_determinism()
from tensorflow.keras.datasets import imdb
(train_input, train_target), (test_input, test_target) = imdb.load_data(
num_words=300)
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/imdb.npz
17464789/17464789 [==============================] - 0s 0us/step
print(train_input.shape, test_input.shape)
(25000,) (25000,)
print(len(train_input[0]))
218
print(len(train_input[1]))
189
print(train_input[0])
[1, 14, 22, 16, 43, 2, 2, 2, 2, 65, 2, 2, 66, 2, 4, 173, 36, 256, 5, 25, 100, 43, 2, 112, 50, 2, 2, 9, 35, 2, 284, 5, 150, 4, 172, 112, 167, 2, 2, 2, 39, 4, 172, 2, 2, 17, 2, 38, 13, 2, 4, 192, 50, 16, 6, 147, 2, 19, 14, 22, 4, 2, 2, 2, 4, 22, 71, 87, 12, 16, 43, 2, 38, 76, 15, 13, 2, 4, 22, 17, 2, 17, 12, 16, 2, 18, 2, 5, 62, 2, 12, 8, 2, 8, 106, 5, 4, 2, 2, 16, 2, 66, 2, 33, 4, 130, 12, 16, 38, 2, 5, 25, 124, 51, 36, 135, 48, 25, 2, 33, 6, 22, 12, 215, 28, 77, 52, 5, 14, 2, 16, 82, 2, 8, 4, 107, 117, 2, 15, 256, 4, 2, 7, 2, 5, 2, 36, 71, 43, 2, 2, 26, 2, 2, 46, 7, 4, 2, 2, 13, 104, 88, 4, 2, 15, 297, 98, 32, 2, 56, 26, 141, 6, 194, 2, 18, 4, 226, 22, 21, 134, 2, 26, 2, 5, 144, 30, 2, 18, 51, 36, 28, 224, 92, 25, 104, 4, 226, 65, 16, 38, 2, 88, 12, 16, 283, 5, 16, 2, 113, 103, 32, 15, 16, 2, 19, 178, 32]
print(train_target[:20])
[1 0 0 1 0 0 1 0 1 0 1 0 0 0 0 0 1 1 0 1]
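# Each review is stored as a list of word indices, and num_words=300 keeps only the 300 most
# frequent words; indices 0-2 are reserved (padding, start-of-sequence, unknown), so any word
# outside the vocabulary shows up as 2. A minimal sketch for peeking at the text behind the
# first review, using imdb.get_word_index(); the +3 offset compensates for the reserved indices:
word_index = imdb.get_word_index()
index_to_word = {i + 3: w for w, i in word_index.items()}
print(' '.join(index_to_word.get(i, '?') for i in train_input[0][:20]))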
from sklearn.model_selection import train_test_split
train_input, val_input, train_target, val_target = train_test_split(
train_input, train_target, test_size=0.2, random_state=42)
import numpy as np
lengths = np.array([len(x) for x in train_input])
print(np.mean(lengths), np.median(lengths))
239.00925 178.0
import matplotlib.pyplot as plt
plt.hist(lengths)
plt.xlabel('length')
plt.ylabel('frequency')
plt.show()
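# The length distribution is skewed to the right and the median length is 178. Before padding
# to maxlen=100 below, a quick check of how many training reviews already fit in 100 tokens
# (necessarily less than half, since the median is 178):
print(np.mean(lengths <= 100))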
from tensorflow.keras.preprocessing.sequence import pad_sequences
train_seq = pad_sequences(train_input, maxlen=100)
print(train_seq.shape)
(20000, 100)
print(train_seq[0])
[ 10 4 20 9 2 2 2 5 45 6 2 2 33 269 8 2 142 2 5 2 17 73 17 204 5 2 19 55 2 2 92 66 104 14 20 93 76 2 151 33 4 58 12 188 2 151 12 215 69 224 142 73 237 6 2 7 2 2 188 2 103 14 31 10 10 2 7 2 5 2 80 91 2 30 2 34 14 20 151 50 26 131 49 2 84 46 50 37 80 79 6 2 46 7 14 20 10 10 2 158]
print(train_input[0][-10:])
[6, 2, 46, 7, 14, 20, 10, 10, 2, 158]
print(train_seq[5])
[ 0 0 0 0 1 2 195 19 49 2 2 190 4 2 2 2 183 10 10 13 82 79 4 2 36 71 269 8 2 25 19 49 7 4 2 2 2 2 2 10 10 48 25 40 2 11 2 2 40 2 2 5 4 2 2 95 14 238 56 129 2 10 10 21 2 94 2 2 2 2 11 190 24 2 2 7 94 205 2 10 10 87 2 34 49 2 7 2 2 2 2 2 290 2 46 48 64 18 4 2]
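# By default pad_sequences truncates and pads at the front of each sequence
# (truncating='pre', padding='pre'): that is why train_seq[0] keeps the end of the original
# review and the short sample train_seq[5] starts with zeros. A sketch of keeping the
# beginning of long reviews instead:
train_seq_post = pad_sequences(train_input, maxlen=100, truncating='post')
print(train_seq_post[0][:10])  # now the first tokens of train_input[0] are kept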
val_seq = pad_sequences(val_input, maxlen=100)
from tensorflow import keras
model = keras.Sequential()
model.add(keras.layers.SimpleRNN(8, input_shape=(100, 300)))
model.add(keras.layers.Dense(1, activation='sigmoid'))
train_oh = keras.utils.to_categorical(train_seq)
print(train_oh.shape)
(20000, 100, 300)
print(train_oh[0][0][:12])
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0.]
print(np.sum(train_oh[0][0]))
1.0
val_oh = keras.utils.to_categorical(val_seq)
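# One-hot encoding turns each review from 100 integers into a 100 x 300 float32 matrix, so the
# training set alone needs roughly 20000 * 100 * 300 * 4 bytes, i.e. about 2.2 GiB -- one reason
# the Embedding layer used later is attractive:
print(f'{train_oh.nbytes / 1024**3:.1f} GiB')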
model.summary()
Model: "sequential" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= simple_rnn (SimpleRNN) (None, 8) 2472 dense (Dense) (None, 1) 9 ================================================================= Total params: 2481 (9.69 KB) Trainable params: 2481 (9.69 KB) Non-trainable params: 0 (0.00 Byte) _________________________________________________________________
rmsprop = keras.optimizers.RMSprop(learning_rate=1e-4)
model.compile(optimizer=rmsprop, loss='binary_crossentropy',
metrics=['accuracy'])
checkpoint_cb = keras.callbacks.ModelCheckpoint('best-simplernn-model.h5',
save_best_only=True)
early_stopping_cb = keras.callbacks.EarlyStopping(patience=3,
restore_best_weights=True)
history = model.fit(train_oh, train_target, epochs=100, batch_size=64,
validation_data=(val_oh, val_target),
callbacks=[checkpoint_cb, early_stopping_cb])
Epoch 1/100
313/313 [==============================] - 31s 80ms/step - loss: 0.7003 - accuracy: 0.5002 - val_loss: 0.6970 - val_accuracy: 0.5058
Epoch 2/100
  2/313 [..............................] - ETA: 17s - loss: 0.7103 - accuracy: 0.5078
/usr/local/lib/python3.10/dist-packages/keras/src/engine/training.py:3079: UserWarning: You are saving your model as an HDF5 file via `model.save()`. This file format is considered legacy. We recommend using instead the native Keras format, e.g. `model.save('my_model.keras')`.
  saving_api.save_model(
313/313 [==============================] - 22s 70ms/step - loss: 0.6956 - accuracy: 0.5123 - val_loss: 0.6946 - val_accuracy: 0.5124
Epoch 3/100
313/313 [==============================] - 23s 74ms/step - loss: 0.6917 - accuracy: 0.5282 - val_loss: 0.6909 - val_accuracy: 0.5318
Epoch 4/100
313/313 [==============================] - 23s 74ms/step - loss: 0.6844 - accuracy: 0.5549 - val_loss: 0.6833 - val_accuracy: 0.5690
Epoch 5/100
313/313 [==============================] - 24s 75ms/step - loss: 0.6784 - accuracy: 0.5778 - val_loss: 0.6797 - val_accuracy: 0.5770
Epoch 6/100
313/313 [==============================] - 23s 73ms/step - loss: 0.6734 - accuracy: 0.5890 - val_loss: 0.6743 - val_accuracy: 0.5890
Epoch 7/100
313/313 [==============================] - 26s 83ms/step - loss: 0.6676 - accuracy: 0.6068 - val_loss: 0.6695 - val_accuracy: 0.5982
Epoch 8/100
313/313 [==============================] - 21s 68ms/step - loss: 0.6614 - accuracy: 0.6197 - val_loss: 0.6628 - val_accuracy: 0.6112
Epoch 9/100
313/313 [==============================] - 23s 73ms/step - loss: 0.6539 - accuracy: 0.6309 - val_loss: 0.6567 - val_accuracy: 0.6228
Epoch 10/100
313/313 [==============================] - 22s 71ms/step - loss: 0.6461 - accuracy: 0.6445 - val_loss: 0.6481 - val_accuracy: 0.6366
Epoch 11/100
313/313 [==============================] - 22s 70ms/step - loss: 0.6377 - accuracy: 0.6564 - val_loss: 0.6409 - val_accuracy: 0.6486
Epoch 12/100
313/313 [==============================] - 23s 74ms/step - loss: 0.6285 - accuracy: 0.6684 - val_loss: 0.6302 - val_accuracy: 0.6642
Epoch 13/100
313/313 [==============================] - 22s 69ms/step - loss: 0.6191 - accuracy: 0.6804 - val_loss: 0.6211 - val_accuracy: 0.6740
Epoch 14/100
313/313 [==============================] - 23s 73ms/step - loss: 0.6092 - accuracy: 0.6884 - val_loss: 0.6122 - val_accuracy: 0.6824
Epoch 15/100
313/313 [==============================] - 24s 78ms/step - loss: 0.5992 - accuracy: 0.6998 - val_loss: 0.6032 - val_accuracy: 0.6896
Epoch 16/100
313/313 [==============================] - 23s 74ms/step - loss: 0.5900 - accuracy: 0.7071 - val_loss: 0.5935 - val_accuracy: 0.6972
Epoch 17/100
313/313 [==============================] - 23s 73ms/step - loss: 0.5816 - accuracy: 0.7142 - val_loss: 0.5855 - val_accuracy: 0.7028
Epoch 18/100
313/313 [==============================] - 22s 70ms/step - loss: 0.5731 - accuracy: 0.7209 - val_loss: 0.5785 - val_accuracy: 0.7094
Epoch 19/100
313/313 [==============================] - 22s 70ms/step - loss: 0.5648 - accuracy: 0.7260 - val_loss: 0.5691 - val_accuracy: 0.7174
Epoch 20/100
313/313 [==============================] - 23s 73ms/step - loss: 0.5569 - accuracy: 0.7308 - val_loss: 0.5620 - val_accuracy: 0.7214
Epoch 21/100
313/313 [==============================] - 22s 70ms/step - loss: 0.5503 - accuracy: 0.7351 - val_loss: 0.5560 - val_accuracy: 0.7234
Epoch 22/100
313/313 [==============================] - 23s 72ms/step - loss: 0.5443 - accuracy: 0.7391 - val_loss: 0.5515 - val_accuracy: 0.7266
Epoch 23/100
313/313 [==============================] - 22s 72ms/step - loss: 0.5381 - accuracy: 0.7427 - val_loss: 0.5473 - val_accuracy: 0.7300
Epoch 24/100
313/313 [==============================] - 23s 75ms/step - loss: 0.5339 - accuracy: 0.7456 - val_loss: 0.5423 - val_accuracy: 0.7344
Epoch 25/100
313/313 [==============================] - 23s 74ms/step - loss: 0.5297 - accuracy: 0.7477 - val_loss: 0.5385 - val_accuracy: 0.7342
Epoch 26/100
313/313 [==============================] - 23s 73ms/step - loss: 0.5257 - accuracy: 0.7510 - val_loss: 0.5360 - val_accuracy: 0.7376
Epoch 27/100
313/313 [==============================] - 22s 71ms/step - loss: 0.5223 - accuracy: 0.7534 - val_loss: 0.5330 - val_accuracy: 0.7386
Epoch 28/100
313/313 [==============================] - 23s 73ms/step - loss: 0.5192 - accuracy: 0.7556 - val_loss: 0.5315 - val_accuracy: 0.7352
Epoch 29/100
313/313 [==============================] - 21s 68ms/step - loss: 0.5159 - accuracy: 0.7567 - val_loss: 0.5394 - val_accuracy: 0.7338
Epoch 30/100
313/313 [==============================] - 23s 73ms/step - loss: 0.5133 - accuracy: 0.7573 - val_loss: 0.5275 - val_accuracy: 0.7386
Epoch 31/100
313/313 [==============================] - 22s 70ms/step - loss: 0.5116 - accuracy: 0.7596 - val_loss: 0.5254 - val_accuracy: 0.7428
Epoch 32/100
313/313 [==============================] - 22s 71ms/step - loss: 0.5090 - accuracy: 0.7609 - val_loss: 0.5244 - val_accuracy: 0.7450
Epoch 33/100
313/313 [==============================] - 25s 79ms/step - loss: 0.5071 - accuracy: 0.7610 - val_loss: 0.5243 - val_accuracy: 0.7438
Epoch 34/100
313/313 [==============================] - 21s 68ms/step - loss: 0.5047 - accuracy: 0.7625 - val_loss: 0.5242 - val_accuracy: 0.7446
Epoch 35/100
313/313 [==============================] - 24s 78ms/step - loss: 0.5036 - accuracy: 0.7634 - val_loss: 0.5218 - val_accuracy: 0.7460
Epoch 36/100
313/313 [==============================] - 23s 73ms/step - loss: 0.5022 - accuracy: 0.7638 - val_loss: 0.5202 - val_accuracy: 0.7462
Epoch 37/100
313/313 [==============================] - 21s 68ms/step - loss: 0.5006 - accuracy: 0.7654 - val_loss: 0.5208 - val_accuracy: 0.7482
Epoch 38/100
313/313 [==============================] - 23s 74ms/step - loss: 0.4992 - accuracy: 0.7649 - val_loss: 0.5210 - val_accuracy: 0.7468
Epoch 39/100
313/313 [==============================] - 23s 73ms/step - loss: 0.4982 - accuracy: 0.7659 - val_loss: 0.5201 - val_accuracy: 0.7442
Epoch 40/100
313/313 [==============================] - 22s 70ms/step - loss: 0.4959 - accuracy: 0.7671 - val_loss: 0.5180 - val_accuracy: 0.7446
Epoch 41/100
313/313 [==============================] - 23s 73ms/step - loss: 0.4950 - accuracy: 0.7684 - val_loss: 0.5162 - val_accuracy: 0.7470
Epoch 42/100
313/313 [==============================] - 31s 99ms/step - loss: 0.4942 - accuracy: 0.7686 - val_loss: 0.5164 - val_accuracy: 0.7480
Epoch 43/100
313/313 [==============================] - 24s 78ms/step - loss: 0.4931 - accuracy: 0.7696 - val_loss: 0.5152 - val_accuracy: 0.7484
Epoch 44/100
313/313 [==============================] - 23s 74ms/step - loss: 0.4919 - accuracy: 0.7692 - val_loss: 0.5139 - val_accuracy: 0.7460
Epoch 45/100
313/313 [==============================] - 23s 73ms/step - loss: 0.4916 - accuracy: 0.7699 - val_loss: 0.5148 - val_accuracy: 0.7478
Epoch 46/100
313/313 [==============================] - 24s 77ms/step - loss: 0.4902 - accuracy: 0.7703 - val_loss: 0.5157 - val_accuracy: 0.7458
Epoch 47/100
313/313 [==============================] - 22s 69ms/step - loss: 0.4892 - accuracy: 0.7714 - val_loss: 0.5147 - val_accuracy: 0.7496
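# The UserWarning above comes from saving the checkpoint in the legacy HDF5 (.h5) format.
# A sketch of the same callback using the native Keras format instead (the .keras filename
# here is only illustrative):
native_checkpoint_cb = keras.callbacks.ModelCheckpoint('best-simplernn-model.keras',
                                                        save_best_only=True)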
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend(['train', 'val'])
plt.show()
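# restore_best_weights=True already leaves the model holding its best weights, so as a sanity
# check the saved checkpoint can be loaded back and scored on the validation set
# (the exact numbers depend on the run):
best_model = keras.models.load_model('best-simplernn-model.h5')
best_model.evaluate(val_oh, val_target)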
model2 = keras.Sequential()
model2.add(keras.layers.Embedding(300, 16, input_length=100))
model2.add(keras.layers.SimpleRNN(8))
model2.add(keras.layers.Dense(1, activation='sigmoid'))
model2.summary()
Model: "sequential_1" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= embedding (Embedding) (None, 100, 16) 4800 simple_rnn_1 (SimpleRNN) (None, 8) 200 dense_1 (Dense) (None, 1) 9 ================================================================= Total params: 5009 (19.57 KB) Trainable params: 5009 (19.57 KB) Non-trainable params: 0 (0.00 Byte) _________________________________________________________________
rmsprop = keras.optimizers.RMSprop(learning_rate=1e-4)
model2.compile(optimizer=rmsprop, loss='binary_crossentropy',
metrics=['accuracy'])
checkpoint_cb = keras.callbacks.ModelCheckpoint('best-embedding-model.h5',
save_best_only=True)
early_stopping_cb = keras.callbacks.EarlyStopping(patience=3,
restore_best_weights=True)
history = model2.fit(train_seq, train_target, epochs=100, batch_size=64,
validation_data=(val_seq, val_target),
callbacks=[checkpoint_cb, early_stopping_cb])
Epoch 1/100
313/313 [==============================] - 39s 119ms/step - loss: 0.6893 - accuracy: 0.5351 - val_loss: 0.6706 - val_accuracy: 0.5872
Epoch 2/100
313/313 [==============================] - 35s 112ms/step - loss: 0.6399 - accuracy: 0.6467 - val_loss: 0.6234 - val_accuracy: 0.6664
Epoch 3/100
313/313 [==============================] - 35s 112ms/step - loss: 0.6051 - accuracy: 0.6941 - val_loss: 0.6003 - val_accuracy: 0.6948
Epoch 4/100
313/313 [==============================] - 36s 114ms/step - loss: 0.5831 - accuracy: 0.7172 - val_loss: 0.5888 - val_accuracy: 0.7026
Epoch 5/100
313/313 [==============================] - 34s 110ms/step - loss: 0.5663 - accuracy: 0.7305 - val_loss: 0.5669 - val_accuracy: 0.7300
Epoch 6/100
313/313 [==============================] - 33s 107ms/step - loss: 0.5527 - accuracy: 0.7408 - val_loss: 0.5536 - val_accuracy: 0.7356
Epoch 7/100
313/313 [==============================] - 35s 112ms/step - loss: 0.5410 - accuracy: 0.7475 - val_loss: 0.5422 - val_accuracy: 0.7404
Epoch 8/100
313/313 [==============================] - 33s 105ms/step - loss: 0.5313 - accuracy: 0.7509 - val_loss: 0.5443 - val_accuracy: 0.7352
Epoch 9/100
313/313 [==============================] - 38s 120ms/step - loss: 0.5236 - accuracy: 0.7549 - val_loss: 0.5463 - val_accuracy: 0.7238
Epoch 10/100
313/313 [==============================] - 35s 112ms/step - loss: 0.5177 - accuracy: 0.7578 - val_loss: 0.5377 - val_accuracy: 0.7382
Epoch 11/100
313/313 [==============================] - 35s 111ms/step - loss: 0.5135 - accuracy: 0.7590 - val_loss: 0.5264 - val_accuracy: 0.7454
Epoch 12/100
313/313 [==============================] - 34s 108ms/step - loss: 0.5094 - accuracy: 0.7618 - val_loss: 0.5265 - val_accuracy: 0.7422
Epoch 13/100
313/313 [==============================] - 34s 109ms/step - loss: 0.5056 - accuracy: 0.7640 - val_loss: 0.5207 - val_accuracy: 0.7468
Epoch 14/100
313/313 [==============================] - 34s 107ms/step - loss: 0.5027 - accuracy: 0.7659 - val_loss: 0.5226 - val_accuracy: 0.7434
Epoch 15/100
313/313 [==============================] - 34s 110ms/step - loss: 0.4998 - accuracy: 0.7689 - val_loss: 0.5678 - val_accuracy: 0.6992
Epoch 16/100
313/313 [==============================] - 38s 121ms/step - loss: 0.4977 - accuracy: 0.7677 - val_loss: 0.5155 - val_accuracy: 0.7500
Epoch 17/100
313/313 [==============================] - 34s 110ms/step - loss: 0.4944 - accuracy: 0.7713 - val_loss: 0.5222 - val_accuracy: 0.7450
Epoch 18/100
313/313 [==============================] - 34s 108ms/step - loss: 0.4927 - accuracy: 0.7712 - val_loss: 0.5149 - val_accuracy: 0.7486
Epoch 19/100
313/313 [==============================] - 33s 106ms/step - loss: 0.4905 - accuracy: 0.7730 - val_loss: 0.5138 - val_accuracy: 0.7464
Epoch 20/100
313/313 [==============================] - 36s 114ms/step - loss: 0.4880 - accuracy: 0.7740 - val_loss: 0.5124 - val_accuracy: 0.7500
Epoch 21/100
313/313 [==============================] - 34s 108ms/step - loss: 0.4871 - accuracy: 0.7749 - val_loss: 0.5143 - val_accuracy: 0.7488
Epoch 22/100
313/313 [==============================] - 39s 125ms/step - loss: 0.4847 - accuracy: 0.7776 - val_loss: 0.5148 - val_accuracy: 0.7486
Epoch 23/100
313/313 [==============================] - 35s 112ms/step - loss: 0.4830 - accuracy: 0.7776 - val_loss: 0.5135 - val_accuracy: 0.7500
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend(['train', 'val'])
plt.show()
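# To score the embedding model on the test set, the test reviews have to go through the same
# preprocessing (padding/truncating to 100 tokens); a minimal sketch, with the actual score
# depending on the run:
test_seq = pad_sequences(test_input, maxlen=100)
model2.evaluate(test_seq, test_target)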