Good Practices in Modern TensorFlow for NLP

In [1]:
__author__ = 'Guillaume Genthial'
__date__ = '2018-09-22'

Setup

In [2]:
from distutils.version import LooseVersion
import sys

if LooseVersion(sys.version) < LooseVersion('3.4'):
    raise Exception('You need python>=3.4, but you have {}'.format(sys.version))
In [3]:
# Standard
from pathlib import Path

# External
import numpy as np
import tensorflow as tf
In [4]:
if LooseVersion(tf.__version__) < LooseVersion('1.9'):
    raise Exception('You need tensorflow>=1.9, but you have {}'.format(tf.__version__))

Eager execution

Eager execution is compatible with NumPy (similar behavior to PyTorch). For a full review, see this notebook from the TensorFlow team.

It's a great tool for debugging, and it allows dynamic graph building (if you really need it...).

In [5]:
# You need to activate it at program startup
tf.enable_eager_execution()
In [6]:
X = tf.random_normal([2, 4])
h = tf.layers.dense(X, 2, activation=tf.nn.relu)
y = tf.nn.softmax(h)
print(y)
print(y.numpy())
tf.Tensor(
[[0.43946627 0.5605337 ]
 [0.6169051  0.38309485]], shape=(2, 2), dtype=float32)
[[0.43946627 0.5605337 ]
 [0.6169051  0.38309485]]

Here, X, h, y are nodes of the computational graph. But you can actually get the value of these nodes!

In the past you would have written

X = tf.placeholder(dtype=tf.float32, shape=[2, 4])
h = tf.layers.dense(X, 2, activation=tf.nn.relu)
y = tf.nn.softmax(h)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(y, feed_dict={X: np.random.normal(size=[2, 4])})

tf.data: feeding data into the graph

tf.placeholder is replaced by tf.data.Dataset.

Placeholders (before)

x = tf.placeholder(dtype=tf.int32, shape=[None, 5])
with tf.Session() as sess:
    x_eval = sess.run(x, feed_dict={x: x_np})
    print(x_eval)

Dataset from np.array

Below is a simple example where we have a np.array with one row per example.

In [7]:
x_np = np.array([[i]*5 for i in range(10)])
x_np
Out[7]:
array([[0, 0, 0, 0, 0],
       [1, 1, 1, 1, 1],
       [2, 2, 2, 2, 2],
       [3, 3, 3, 3, 3],
       [4, 4, 4, 4, 4],
       [5, 5, 5, 5, 5],
       [6, 6, 6, 6, 6],
       [7, 7, 7, 7, 7],
       [8, 8, 8, 8, 8],
       [9, 9, 9, 9, 9]])

We create a Dataset from this array.

This dataset is a node of the graph. Each time you query its value, it will move to the next row of the underlying np.array.

In [8]:
dataset = tf.data.Dataset.from_tensor_slices(x_np)
In [9]:
for el in dataset:
    print(el)
tf.Tensor([0 0 0 0 0], shape=(5,), dtype=int64)
tf.Tensor([1 1 1 1 1], shape=(5,), dtype=int64)
tf.Tensor([2 2 2 2 2], shape=(5,), dtype=int64)
tf.Tensor([3 3 3 3 3], shape=(5,), dtype=int64)
tf.Tensor([4 4 4 4 4], shape=(5,), dtype=int64)
tf.Tensor([5 5 5 5 5], shape=(5,), dtype=int64)
tf.Tensor([6 6 6 6 6], shape=(5,), dtype=int64)
tf.Tensor([7 7 7 7 7], shape=(5,), dtype=int64)
tf.Tensor([8 8 8 8 8], shape=(5,), dtype=int64)
tf.Tensor([9 9 9 9 9], shape=(5,), dtype=int64)

el is the equivalent of the former tf.placeholder. It's a node of the graph, to which you can apply any TensorFlow operation.
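
For instance, a minimal sketch (the choice of tf.reduce_sum is just illustrative, any regular op would do):

for el in dataset:
    # Sum the 5 entries of each row; in eager mode the value is printed immediately
    print(tf.reduce_sum(el))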

Dataset from text file

Let's just display the content of the file.

In [10]:
path = 'test.txt'
with Path(path).open() as f:
    print(f.read())
Hello world 1
Hello world 2
Hello world 3
Hello world 4
Hello world 5
Hello world 6
Hello world 7
Hello world 8
Hello world 9
Hello world 1 2 3

The following does just the same as above, but now el is a tf.Tensor of dtype=tf.string!

In [11]:
dataset = tf.data.TextLineDataset([path])
for el in dataset:
    print(el)
tf.Tensor(b'Hello world 1', shape=(), dtype=string)
tf.Tensor(b'Hello world 2', shape=(), dtype=string)
tf.Tensor(b'Hello world 3', shape=(), dtype=string)
tf.Tensor(b'Hello world 4', shape=(), dtype=string)
tf.Tensor(b'Hello world 5', shape=(), dtype=string)
tf.Tensor(b'Hello world 6', shape=(), dtype=string)
tf.Tensor(b'Hello world 7', shape=(), dtype=string)
tf.Tensor(b'Hello world 8', shape=(), dtype=string)
tf.Tensor(b'Hello world 9', shape=(), dtype=string)
tf.Tensor(b'Hello world 1 2 3', shape=(), dtype=string)

Dataset from custom generator

The best of both worlds, perfect for NLP.

It allows you to put all your logic in pure Python, in your generator_fn, before feeding it to the graph.

In [12]:
def generator_fn():
    for _ in range(2):
        yield b'Hello world'
In [13]:
dataset = (tf.data.Dataset.from_generator(
    generator_fn, 
    output_types=(tf.string),  # Define type and shape of your generator_fn output
    output_shapes=()))         # like you would have for your `placeholders`
In [14]:
for el in dataset:
    print(el)
tf.Tensor(b'Hello world', shape=(), dtype=string)
tf.Tensor(b'Hello world', shape=(), dtype=string)

tf.data: Dataset Transforms

Shuffle

Note: the buffer_size is the number of elements you load into RAM before starting to sample from it. If it's too small (a buffer_size of 1 means no shuffling at all), the shuffling won't be effective. Ideally, your buffer_size is the same as the number of elements in your dataset, but because not all datasets fit in RAM, you need to be able to set it manually.
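
As a toy illustration of this effect (tf.data.Dataset.range is used here only for the demo):

ds = tf.data.Dataset.range(5)
# buffer_size=1: elements come out in their original order (no shuffling at all)
print([el.numpy() for el in ds.shuffle(buffer_size=1)])
# buffer_size=5 (the whole dataset): a true random permutation
print([el.numpy() for el in ds.shuffle(buffer_size=5)])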

In [15]:
dataset = dataset.shuffle(buffer_size=10)
In [16]:
for el in dataset:
    print(el)
tf.Tensor(b'Hello world', shape=(), dtype=string)
tf.Tensor(b'Hello world', shape=(), dtype=string)

Repeat

Repeat your dataset to perform multiple epochs!

In [17]:
dataset = dataset.repeat(2)  # 2 epochs
In [18]:
for el in dataset:
    print(el)
tf.Tensor(b'Hello world', shape=(), dtype=string)
tf.Tensor(b'Hello world', shape=(), dtype=string)
tf.Tensor(b'Hello world', shape=(), dtype=string)
tf.Tensor(b'Hello world', shape=(), dtype=string)

Map

Note: while map is super handy when working with images (TensorFlow has a lot of image preprocessing functions and efficiency is crucial), it's not as practical for NLP, because you're now working with tensors. We found it easier to write most of the preprocessing logic in Python, in a generator_fn, before feeding it to the graph.

In [19]:
dataset = dataset.map(
    lambda t: tf.string_split([t], delimiter=' ').values, 
    num_parallel_calls=4)  # Multithreading
In [20]:
for el in dataset:
    print(el)
tf.Tensor([b'Hello' b'world'], shape=(2,), dtype=string)
tf.Tensor([b'Hello' b'world'], shape=(2,), dtype=string)
tf.Tensor([b'Hello' b'world'], shape=(2,), dtype=string)
tf.Tensor([b'Hello' b'world'], shape=(2,), dtype=string)

Batch

In [21]:
dataset = dataset.batch(batch_size=3)
In [22]:
for el in dataset:
    print(el)
tf.Tensor(
[[b'Hello' b'world']
 [b'Hello' b'world']
 [b'Hello' b'world']], shape=(3, 2), dtype=string)
tf.Tensor([[b'Hello' b'world']], shape=(1, 2), dtype=string)

Padded batch

In NLP, we usually work with sentences of different lengths. When building a batch, we need to 'pad', i.e. add some fake elements at the end of the shorter sentences. You can perform this operation easily in TensorFlow.

Here is a dummy example:

In [23]:
def generator_fn():
    yield [1, 2]
    yield [1, 2, 3]

dataset = tf.data.Dataset.from_generator(
    generator_fn, 
    output_types=(tf.int32), 
    output_shapes=([None]))
In [24]:
dataset = dataset.padded_batch(
    batch_size=2, 
    padded_shapes=([None]), 
    padding_values=(4))  # Optional; if not set, defaults to 0
In [25]:
for el in dataset:
    print(el)
tf.Tensor(
[[1 2 4]
 [1 2 3]], shape=(2, 3), dtype=int32)

Notice that a 4 has been appended at the end of the first row.

And much more: prefetch, zip, concatenate, skip, take etc. See the documentation.
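
For instance, a quick sketch of zip, skip and take on toy datasets (the range datasets below are made up just for the illustration):

ds_a = tf.data.Dataset.range(3)     # 0, 1, 2
ds_b = tf.data.Dataset.range(3, 6)  # 3, 4, 5
zipped = tf.data.Dataset.zip((ds_a, ds_b))
for el in zipped.skip(1).take(2):   # drop the first pair, keep the next two
    print(el)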

Note: the recommended standard workflow is (a sketch putting it all together follows the list)

  1. shuffle
  2. repeat (repeat after shuffle so that one epoch = all the examples)
  3. map, using the num_parallel_calls argument to get multithreading for free
  4. batch or padded_batch
  5. prefetch (prefetches the next elements so that the GPU doesn't suffer from data starvation and end up using only, say, 80% of your expensive GPU)
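
Putting the pieces together on the text file from above, a minimal sketch of that workflow (the batch size, buffer size and '<pad>' padding value are illustrative choices):

dataset = tf.data.TextLineDataset(['test.txt'])
dataset = (dataset
           .shuffle(buffer_size=10)   # 1. shuffle over (up to) the whole file
           .repeat(2)                 # 2. repeat -> 2 epochs
           .map(lambda t: tf.string_split([t], delimiter=' ').values,
                num_parallel_calls=4)  # 3. map, with multithreading
           .padded_batch(3, padded_shapes=[None],
                         padding_values=tf.constant('<pad>'))  # 4. padded_batch
           .prefetch(1))              # 5. prefetch the next batch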

NLP: preprocessing in TensorFlow

Tokenizing by white space in TensorFlow

This is an example of why using map is kind of annoying in NLP. It works, but it's not as easy as just using .split() or any other Python code.

In [26]:
def tf_tokenize(t):
    return tf.string_split([t], delimiter=' ').values
In [27]:
dataset = tf.data.TextLineDataset(['test.txt'])
dataset = dataset.map(tf_tokenize)
for el in dataset:
    print(el)
tf.Tensor([b'Hello' b'world' b'1'], shape=(3,), dtype=string)
tf.Tensor([b'Hello' b'world' b'2'], shape=(3,), dtype=string)
tf.Tensor([b'Hello' b'world' b'3'], shape=(3,), dtype=string)
tf.Tensor([b'Hello' b'world' b'4'], shape=(3,), dtype=string)
tf.Tensor([b'Hello' b'world' b'5'], shape=(3,), dtype=string)
tf.Tensor([b'Hello' b'world' b'6'], shape=(3,), dtype=string)
tf.Tensor([b'Hello' b'world' b'7'], shape=(3,), dtype=string)
tf.Tensor([b'Hello' b'world' b'8'], shape=(3,), dtype=string)
tf.Tensor([b'Hello' b'world' b'9'], shape=(3,), dtype=string)
tf.Tensor([b'Hello' b'world' b'1' b'2' b'3'], shape=(5,), dtype=string)

Lookup token index from vocab file in TensorFlow

You're probably used to performing the lookup token -> token_idx outside TensorFlow. However, tf.contrib.lookup provides exactly this functionality. It's fast, and when exporting the model for serving, it will treat your vocab.txt file as one of the model's resources and keep it with the model!

In [28]:
# One lexeme per line
path_vocab = 'vocab.txt'
with Path(path_vocab).open() as f:
    for idx, line in enumerate(f):
        print(idx, ' -> ', line.strip())
0  ->  Hello
1  ->  world

To use it in TensorFlow:

In [29]:
# The last idx (2) will be used for unknown words
lookup_table = tf.contrib.lookup.index_table_from_file(
    path_vocab, num_oov_buckets=1)
In [30]:
for el in dataset:
    print(lookup_table.lookup(el))
tf.Tensor([0 1 2], shape=(3,), dtype=int64)
tf.Tensor([0 1 2], shape=(3,), dtype=int64)
tf.Tensor([0 1 2], shape=(3,), dtype=int64)
tf.Tensor([0 1 2], shape=(3,), dtype=int64)
tf.Tensor([0 1 2], shape=(3,), dtype=int64)
tf.Tensor([0 1 2], shape=(3,), dtype=int64)
tf.Tensor([0 1 2], shape=(3,), dtype=int64)
tf.Tensor([0 1 2], shape=(3,), dtype=int64)
tf.Tensor([0 1 2], shape=(3,), dtype=int64)
tf.Tensor([0 1 2 2 2], shape=(5,), dtype=int64)

Full Example

Task and Data

The tokens_generator yields lists of token ids (along with their sequence length). We map even and odd numbers to 2 different ids.

The labels_generator yields lists of label ids. We want to predict whether a token is

  • a word (label 0)
  • an odd number (label 1)
  • an even number (label 2)
In [31]:
# We tokenize by white space and assign these ids
tok_to_idx = {'hello': 0, 'world': 1, '<odd>': 2, '<even>': 3}

def tokens_generator():
    with Path(path).open() as f:
        for line in f:
            # Tokenize by white space
            tokens = line.strip().split()
            token_ids = []
            for tok in tokens:
                # Look for digits
                if tok.isdigit():
                    if int(tok) % 2 == 0:
                        tok = '<even>'
                    else:
                        tok = '<odd>'
                token_ids.append(tok_to_idx.get(tok.lower(), len(tok_to_idx)))
            yield (token_ids, len(token_ids))
            
def get_label(token_id):
    if token_id == 2:
        return 1
    elif token_id == 3:
        return 2
    else:
        return 0  
            
def labels_generator():
    for token_ids, _ in tokens_generator():
        yield [get_label(tok_id) for tok_id in token_ids]
In [32]:
dataset = tf.data.Dataset.from_generator(
    tokens_generator, 
    output_types=(tf.int32, tf.int32), 
    output_shapes=([None], ()))
for el in dataset:
    print(el)
(<tf.Tensor: id=320, shape=(3,), dtype=int32, numpy=array([0, 1, 2], dtype=int32)>, <tf.Tensor: id=321, shape=(), dtype=int32, numpy=3>)
(<tf.Tensor: id=324, shape=(3,), dtype=int32, numpy=array([0, 1, 3], dtype=int32)>, <tf.Tensor: id=325, shape=(), dtype=int32, numpy=3>)
(<tf.Tensor: id=328, shape=(3,), dtype=int32, numpy=array([0, 1, 2], dtype=int32)>, <tf.Tensor: id=329, shape=(), dtype=int32, numpy=3>)
(<tf.Tensor: id=332, shape=(3,), dtype=int32, numpy=array([0, 1, 3], dtype=int32)>, <tf.Tensor: id=333, shape=(), dtype=int32, numpy=3>)
(<tf.Tensor: id=336, shape=(3,), dtype=int32, numpy=array([0, 1, 2], dtype=int32)>, <tf.Tensor: id=337, shape=(), dtype=int32, numpy=3>)
(<tf.Tensor: id=340, shape=(3,), dtype=int32, numpy=array([0, 1, 3], dtype=int32)>, <tf.Tensor: id=341, shape=(), dtype=int32, numpy=3>)
(<tf.Tensor: id=344, shape=(3,), dtype=int32, numpy=array([0, 1, 2], dtype=int32)>, <tf.Tensor: id=345, shape=(), dtype=int32, numpy=3>)
(<tf.Tensor: id=348, shape=(3,), dtype=int32, numpy=array([0, 1, 3], dtype=int32)>, <tf.Tensor: id=349, shape=(), dtype=int32, numpy=3>)
(<tf.Tensor: id=352, shape=(3,), dtype=int32, numpy=array([0, 1, 2], dtype=int32)>, <tf.Tensor: id=353, shape=(), dtype=int32, numpy=3>)
(<tf.Tensor: id=356, shape=(5,), dtype=int32, numpy=array([0, 1, 2, 3, 2], dtype=int32)>, <tf.Tensor: id=357, shape=(), dtype=int32, numpy=5>)

Graph (test with eager execution)

Let's build a model that predicts the classes 0, 1 and 2 above.

We test our graph logic here, with eager execution activated.

In [33]:
batch_size = 4
vocab_size = 4
dim = 100
In [34]:
shapes = ([None], ())
defaults = (0, 0)
# The last sentence is longer: need padding
dataset = dataset.padded_batch(   
    batch_size, shapes, defaults)
In [35]:
# Define all variables (in eager execution mode, this has to be done just once)
# Otherwise you would create new variables at each loop iteration!
embeddings = tf.get_variable('embeddings', shape=[vocab_size, dim])
lstm_cell = tf.contrib.rnn.LSTMCell(100)
dense_layer = tf.layers.Dense(3, activation=tf.nn.relu)
In [36]:
for tokens, sequence_length in dataset:
    token_embeddings = tf.nn.embedding_lookup(embeddings, tokens)
    lstm_output, _ = tf.nn.dynamic_rnn(
        lstm_cell, token_embeddings, dtype=tf.float32, sequence_length=sequence_length)
    logits = dense_layer(lstm_output)
    print(logits.shape)
(4, 3, 3)
(4, 3, 3)
(2, 5, 3)

No errors or cryptic messages about a shape mismatch: it seems like our TensorFlow logic is fine.

Model (tf.estimator)

tf.estimator uses the traditional graph-based environment (no eager execution).

If you use the tf.estimator interface, you will get for free:

  1. Tensorboard
  2. Weights serialization
  3. Logging
  4. Model export for serving
  5. Unified structure compatible with open-source code

Before: custom model classes

People used to write custom model classes

class Model:

    def get_feed_dict(self, X, y):
        return {self.X: X, self.y: y}

    def build(self):
        do_stuff()

    def train(self, X, y):
        with tf.Session() as sess:
            do_some_training()

Now: tf.estimator

Now there is a common interface for models.

def input_fn():
    # Return a tf.data.Dataset that yields a tuple features, labels
    return dataset

def model_fn(features, labels, mode, params):
    """
    Parameters
    ----------
    features: tf.Tensor or nested structure
        Returned by `input_fn`
    labels: tf.Tensor of nested structure
        Returned by `input_fn`
    mode: tf.estimator.ModeKeys
        Either PREDICT / EVAL / TRAIN
    params: dict
        Hyperparams

    Returns
    -------
    tf.estimator.EstimatorSpec
    """
    if mode == tf.estimator.ModeKeys.TRAIN:
        return tf.estimator.EstimatorSpec(mode, loss=loss, train_op=train_op)
    elif mode == ...:
        ...

estimator = tf.estimator.Estimator(
    model_fn=model_fn, params=params)
estimator.train(input_fn)
In [37]:
# Clear all the objects we defined above, to be sure 
# we don't mess with anything
tf.reset_default_graph() 

input_fn

A callable that returns a dataset that yields (features, labels) tuples.

In [38]:
def input_fn():
    # Create datasets for features and labels
    dataset_tokens = tf.data.Dataset.from_generator(
        tokens_generator, 
        output_types=(tf.int32, tf.int32), 
        output_shapes=([None], ()))
    dataset_output = tf.data.Dataset.from_generator(
        labels_generator, 
        output_types=(tf.int32), 
        output_shapes=([None]))
    
    # Zip features and labels in one Dataset
    dataset = tf.data.Dataset.zip((dataset_tokens, dataset_output))
        
    # Shuffle, repeat, batch and prefetch
    shapes = (([None], ()), [None])
    defaults = ((0, 0), 0)
    dataset = (dataset
               .shuffle(10)
               .repeat(100)
               .padded_batch(4, shapes, defaults)
               .prefetch(1))

    # Dataset yields tuple of features, labels
    return dataset

model_fn

Takes (features, labels, mode, params) as inputs and returns tf.estimator.EstimatorSpec objects.

In [39]:
def model_fn(features, labels, mode, params):
    # Args features and labels are the same as returned by the dataset
    tokens, sequence_length = features
    
    # For Serving (ignore this)
    if isinstance(features, dict):
        tokens = features['tokens']
        sequence_length = features['sequence_length']
    
    # 1. Define the graph
    vocab_size = params['vocab_size']
    dim = params['dim']
    embeddings = tf.get_variable('embeddings', shape=[vocab_size, dim])
    token_embeddings = tf.nn.embedding_lookup(embeddings, tokens)
    lstm_cell = tf.contrib.rnn.LSTMCell(20)
    lstm_output, _ = tf.nn.dynamic_rnn(
        lstm_cell, token_embeddings, dtype=tf.float32)
    
    logits = tf.layers.dense(lstm_output, 3)
    preds = tf.argmax(logits, axis=-1)
    
    # 2. Define EstimatorSpecs for PREDICT
    if mode == tf.estimator.ModeKeys.PREDICT:
        # Predictions is any nested object (dict is convenient)
        predictions = {'logits': logits, 'preds': preds}
        # export_outputs is for serving (ignore this)
        export_outputs = {
            'predictions': tf.estimator.export.PredictOutput(predictions)}
        return tf.estimator.EstimatorSpec(mode, predictions=predictions, 
                                          export_outputs=export_outputs)
    else:
        # 3. Define loss and metrics
        # Define weights to account for padding
        weights = tf.sequence_mask(sequence_length)
        loss = tf.losses.sparse_softmax_cross_entropy(
            logits=logits, labels=labels, weights=weights)
        metrics = {
            'accuracy': tf.metrics.accuracy(labels=labels, predictions=preds),
        }
        # For Tensorboard
        for k, v in metrics.items():
            # v[1] is the update op of the metrics object
            tf.summary.scalar(k, v[1])
    
        # 4. Define EstimatorSpecs for EVAL
        # Having an eval mode and metrics in Tensorflow allows you to use
        # built-in early stopping (see later)
        if mode == tf.estimator.ModeKeys.EVAL:
            return tf.estimator.EstimatorSpec(mode, loss=loss, 
                                              eval_metric_ops=metrics)
            
        # 5. Define EstimatorSpecs for TRAIN
        elif mode == tf.estimator.ModeKeys.TRAIN:
            global_step = tf.train.get_or_create_global_step()
            train_op = (tf.train.AdamOptimizer(learning_rate=0.1)
                        .minimize(loss, global_step=global_step))
            return tf.estimator.EstimatorSpec(mode, loss=loss, 
                                              train_op=train_op)

What do you think about this model_fn? It seems like we only wrote the things that matter (not a lot of boilerplate!)

Instantiate and train your Estimator

Now, let's define our estimator and train it!

In [40]:
params = {
    'vocab_size': 4,
    'dim': 3
}
estimator = tf.estimator.Estimator(
    model_fn=model_fn,
    model_dir='model',  # Will save the weights here automatically
    params=params)
INFO:tensorflow:Using default config.
INFO:tensorflow:Using config: {'_session_config': None, '_task_id': 0, '_save_summary_steps': 100, '_tf_random_seed': None, '_service': None, '_log_step_count_steps': 100, '_keep_checkpoint_every_n_hours': 10000, '_task_type': 'worker', '_num_worker_replicas': 1, '_cluster_spec': <tensorflow.python.training.server_lib.ClusterSpec object at 0x121fe44a8>, '_num_ps_replicas': 0, '_save_checkpoints_secs': 600, '_save_checkpoints_steps': None, '_keep_checkpoint_max': 5, '_train_distribute': None, '_device_fn': None, '_master': '', '_is_chief': True, '_evaluation_master': '', '_global_id_in_cluster': 0, '_model_dir': 'model'}
In [41]:
estimator.train(input_fn)
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from model/model.ckpt-2000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 2000 into model/model.ckpt.
INFO:tensorflow:loss = 5.960464e-08, step = 2001
INFO:tensorflow:global_step/sec: 179.608
INFO:tensorflow:loss = 6.953875e-08, step = 2101 (0.558 sec)
INFO:tensorflow:global_step/sec: 274.294
INFO:tensorflow:loss = 5.960464e-08, step = 2201 (0.364 sec)
INFO:tensorflow:Saving checkpoints for 2250 into model/model.ckpt.
INFO:tensorflow:Loss for final step: 1.8732878e-07.
Out[41]:
<tensorflow.python.estimator.estimator.Estimator at 0x1169f0e80>

TensorBoard, train_and_evaluate, predict etc.

Now, the estimator is trained, serialized to disk etc. You also have access to TensorBoard. (Lots of stuff for free, without having to write boilerplate code!)

To access TensorBoard:

tensorboard --logdir model

Check the documentation for evaluate, train_and_evaluate, etc.

Example with early stopping, where we run evaluation every 2 minutes (120 seconds).

hook = tf.contrib.estimator.stop_if_no_increase_hook(
        estimator, 'accuracy', 500, min_steps=8000, run_every_secs=120)
train_spec = tf.estimator.TrainSpec(input_fn=input_fn, hooks=[hook])
eval_spec = tf.estimator.EvalSpec(input_fn=input_fn, throttle_secs=120)
tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)
In [42]:
# Iterate over the first 2 elements of the (shuffled) dataset and yield predictions
# You need to write variants of your input_fn for eval / predict modes
for idx, predictions in enumerate(estimator.predict(input_fn)):
    print(predictions['preds'])
    if idx > 0:
        break
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from model/model.ckpt-2250
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
[0 0 1 2 0]
[0 0 1 2 1]
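
As the comment above says, in practice you write variants of input_fn for the eval / predict modes. A minimal sketch of such a variant (the name predict_input_fn is ours): the same pipeline without shuffle and repeat, so you get a single ordered pass over the data. Labels are still produced here but are simply ignored in PREDICT mode.

def predict_input_fn():
    # Same features/labels structure as input_fn, but no shuffle and no repeat:
    # one ordered pass over the file
    dataset_tokens = tf.data.Dataset.from_generator(
        tokens_generator,
        output_types=(tf.int32, tf.int32),
        output_shapes=([None], ()))
    dataset_output = tf.data.Dataset.from_generator(
        labels_generator,
        output_types=(tf.int32),
        output_shapes=([None]))
    dataset = tf.data.Dataset.zip((dataset_tokens, dataset_output))
    shapes = (([None], ()), [None])
    defaults = ((0, 0), 0)
    return dataset.padded_batch(4, shapes, defaults)

# Usage: predictions now come back in the same order as the lines of test.txt
# for predictions in estimator.predict(predict_input_fn):
#     print(predictions['preds'])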

A word about TensorFlow model serving

Exporting an inference graph and the serving signature is "simple" (though the serving_input_fn interface could be improved). The cool thing is that once you have your tf.estimator and your serving_input_fn, you can just use tensorflow_serving and get a RESTful API serving your model!

Serving interface

In [43]:
def serving_input_fn():
    tokens = tf.placeholder(
        dtype=tf.int32, shape=[None, None], name="tokens")
    sequence_length = tf.size(tokens)
    features = {'tokens': tokens, 'sequence_length': sequence_length}
    return tf.estimator.export.ServingInputReceiver(
        features=features, receiver_tensors=tokens)
In [44]:
estimator.export_savedmodel('export', serving_input_fn)
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Signatures INCLUDED in export for Predict: ['predictions', 'serving_default']
INFO:tensorflow:Signatures INCLUDED in export for Regress: None
INFO:tensorflow:Signatures INCLUDED in export for Train: None
INFO:tensorflow:Signatures INCLUDED in export for Classify: None
INFO:tensorflow:Signatures INCLUDED in export for Eval: None
INFO:tensorflow:Restoring parameters from model/model.ckpt-2250
INFO:tensorflow:Assets added to graph.
INFO:tensorflow:No assets to write.
INFO:tensorflow:SavedModel written to: export/temp-b'1537767208'/saved_model.pb
Out[44]:
b'export/1537767208'

Docker Image

Pull existing image

docker pull tensorflow/serving

Run

docker run -p 8501:8501 \
--mount type=bind,\
source=path_to_your_export_model,\
target=/models/dummy \
-e MODEL_NAME=dummy -t tensorflow/serving &

REST API POST with curl

curl -d '{"instances": [[0, 1, 2],[0, 1, 3]]}' -X POST \
http://localhost:8501/v1/models/dummy:predict
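
REST API POST from Python

The same request can also be issued from Python. A minimal sketch, assuming the serving container above is running and the requests package is installed:

import requests

response = requests.post(
    'http://localhost:8501/v1/models/dummy:predict',
    json={'instances': [[0, 1, 2], [0, 1, 3]]})
print(response.json())  # e.g. {'predictions': [...]}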