Neural style transfer

This notebook is an assignment for fast.ai Lesson 8.

Neural Style Transfer is based on this paper.

Steps needed for Style Transfer using VGG:

Content extraction

  • Read the cont_image
  • Resize cont_image
  • Preprocess: subtract the per-channel ImageNet mean and convert RGB -> BGR
  • Create VGG_avg
  • Generate P(l) = activations for the cont_image at layer l
  • Generate F(l) = activations for white noise image at layer l
  • content_loss = MSE(P(l), F(l))
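
The content loss above can be sketched in plain NumPy. This is only an illustration, not the notebook's Keras code: the arrays are random stand-ins for P(l) and F(l), and `content_loss` is a hypothetical helper name.

```python
import numpy as np

def content_loss(p, f):
    """Mean squared error between two activation tensors of equal shape."""
    p = np.asarray(p, dtype=np.float64)
    f = np.asarray(f, dtype=np.float64)
    if p.shape != f.shape:
        raise ValueError("activation shapes must match")
    return float(np.mean((p - f) ** 2))

# Stand-in activations with shape (height, width, channels).
# Real ones would come from a VGG layer; these are random placeholders.
rng = np.random.default_rng(0)
P = rng.normal(size=(4, 4, 8))  # assumed content-image activations
F = rng.normal(size=(4, 4, 8))  # assumed noise-image activations

print(content_loss(P, P))  # identical activations give zero loss
print(content_loss(P, F))  # differing activations give a positive loss
```

Driving this value toward zero by changing the noise image is what makes the generated image reproduce the content image's activations at layer l.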

Style extraction

  • Read the style_image
  • Resize style_image
  • Preprocess: subtract the per-channel ImageNet mean and convert RGB -> BGR
  • Create VGG_avg
  • Generate the Gram matrix for the style image, A(L) = F · Ft (the inner products between the rows of F) for each layer L, where F is the vectorized feature map with one row per channel. (In the paper, each layer's contribution to the style loss has its own weight.)
  • Generate the Gram matrix for the white noise image, G(L), in the same way
  • style_loss = MSE(A(L), G(L))
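
The Gram matrix and style loss above can be sketched in NumPy as follows. The function names, the random feature maps, and the normalisation constant are assumptions for illustration; they are not taken from the notebook.

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix of a (height, width, channels) feature map.

    Each channel is flattened into one row, so F has shape
    (channels, height * width) and the result is (channels, channels).
    """
    h, w, c = features.shape
    F = features.reshape(h * w, c).T   # one row per channel
    return F @ F.T / (h * w * c)       # dividing by the size is an assumed normalisation

def style_loss(style_feats, gen_feats):
    """MSE between the Gram matrices of two feature maps."""
    A = gram_matrix(style_feats)
    G = gram_matrix(gen_feats)
    return float(np.mean((A - G) ** 2))

rng = np.random.default_rng(0)
style = rng.normal(size=(4, 4, 3))   # placeholder style activations
noise = rng.normal(size=(4, 4, 3))   # placeholder noise activations

A = gram_matrix(style)
print(A.shape)               # (3, 3)
print(np.allclose(A, A.T))   # True
print(style_loss(style, noise))
```

Because the Gram matrix sums over all spatial positions, shuffling the pixels of a feature map leaves it unchanged. This is why it captures texture-like statistics rather than layout.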

Style transfer

  • loss(c,s,x) = a * content_loss(c, x) + b * style_loss(s, x), where c = content image, s = style image, x = generated image, and a and b are scalar weights
  • Use SciPy's implementation of L-BFGS to find the values of "x" that minimize the loss (fmin_l_bfgs_b(loss, x0=x, args=(c, s))). In our case "x" is the image's pixels, so we are searching for the image that is close to the content image (c) in content and to the style image (s) in style.
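
To show how the weighted loss and the optimizer fit together, here is a toy sketch with short vectors in place of images. Both loss terms are plain MSE stand-ins (no VGG involved), the weights are arbitrary, and `total_loss` and `total_grad` are hypothetical names.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

def total_loss(x, c, s, a, b):
    """a * content_loss(c, x) + b * style_loss(s, x), using MSE stand-ins."""
    return a * np.mean((x - c) ** 2) + b * np.mean((x - s) ** 2)

def total_grad(x, c, s, a, b):
    """Analytic gradient of total_loss with respect to x."""
    return (2.0 / x.size) * (a * (x - c) + b * (x - s))

c = np.array([0.0, 0.0, 0.0, 0.0])   # toy "content" target
s = np.array([1.0, 2.0, 3.0, 4.0])   # toy "style" target
a, b = 1.0, 3.0                      # loss weights (arbitrary)

x0 = np.zeros_like(c)
x_min, min_val, info = fmin_l_bfgs_b(total_loss, x0,
                                     fprime=total_grad, args=(c, s, a, b))

print(x_min)                        # the compromise found by L-BFGS
print((a * c + b * s) / (a + b))    # closed-form minimizer for this toy loss
```

With these stand-ins the minimizer is the weighted average of c and s, so raising b pulls the result toward the style target. The real losses behave the same way, only without a closed form.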

Setup

Togetsukyo Bridge in Arashiyama, Kyoto.

In [1]:
%matplotlib inline
import importlib
import utils2; importlib.reload(utils2)
from utils2 import *

from scipy.optimize import fmin_l_bfgs_b
from scipy.misc import imsave  # removed in SciPy 1.2; on newer versions imageio.imwrite is a replacement
from keras import metrics

from vgg16_avg import VGG16_Avg
Using TensorFlow backend.
In [2]:
# Tell TensorFlow to use no more GPU RAM than necessary
limit_mem()
In [4]:
path = '/home/ubuntu/cutting-edge-dl-for-coders-part2'
In [5]:
img=Image.open(path+'/data/bridge.jpg')
plt.imshow(np.array(img))
Out[5]:
<matplotlib.image.AxesImage at 0x7fea5406df28>
In [6]:
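# Per-channel ImageNet means in RGB order
rn_mean = np.array([123.68, 116.779, 103.939], dtype=np.float32)
# Subtract the mean, then reverse the channel axis (RGB -> BGR)
preproc = lambda x: (x - rn_mean)[:, :, :, ::-1]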
In [7]:
# Undo preproc: reshape to s, flip BGR -> RGB, add the mean back, clip to the valid pixel range
deproc = lambda x,s: np.clip(x.reshape(s)[:, :, :, ::-1] + rn_mean, 0, 255)
In [8]:
img_arr = preproc(np.expand_dims(np.array(img), 0))
shp = img_arr.shape
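
As a quick sanity check on the two helpers, the sketch below runs a round trip on a tiny random array that stands in for the photo. The 2x2 image and the variable names `fake_img`, `arr`, and `restored` are made up for illustration.

```python
import numpy as np

rn_mean = np.array([123.68, 116.779, 103.939], dtype=np.float32)
preproc = lambda x: (x - rn_mean)[:, :, :, ::-1]
deproc = lambda x, s: np.clip(x.reshape(s)[:, :, :, ::-1] + rn_mean, 0, 255)

# Hypothetical 2x2 RGB image with a leading batch axis, in place of bridge.jpg.
rng = np.random.default_rng(0)
fake_img = rng.integers(0, 256, size=(1, 2, 2, 3)).astype(np.uint8)

arr = preproc(fake_img)                           # mean-centred, BGR order
restored = deproc(arr.flatten(), fake_img.shape)  # back to RGB pixel values

print(arr.shape)                                   # (1, 2, 2, 3)
print(np.allclose(restored, fake_img, atol=1e-3))  # round trip recovers the image
```

After preproc, channel 0 holds the mean-centred blue values, which is the order this VGG setup expects.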

Recreate Content

In [9]:
model = VGG16_Avg(include_top=False)
In [10]:
layer = model.get_layer('block5_conv1').output
In [11]:
layer_model = Model(model.input, layer)
targ = K.variable(layer_model.predict(img_arr))
In [12]:
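class Evaluator(object):
    # fmin_l_bfgs_b takes separate loss and gradient callbacks, but the Keras
    # function returns both at once, so loss() caches the gradient for grads().
    def __init__(self, f, shp): self.f, self.shp = f, shp

    def loss(self, x):
        loss_, self.grad_values = self.f([x.reshape(self.shp)])
        return loss_.astype(np.float64)

    def grads(self, x): return self.grad_values.flatten().astype(np.float64)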
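
To see the Evaluator pattern without Keras, the sketch below plugs in a toy function that returns a loss and its gradient in one call, as the Keras function would. `toy_fn`, `target`, and the call counter are hypothetical stand-ins, and the loss is a simple sum of squares.

```python
import numpy as np
from scipy.optimize import fmin_l_bfgs_b

class Evaluator(object):
    def __init__(self, f, shp): self.f, self.shp = f, shp

    def loss(self, x):
        # One call yields both values; keep the gradient for grads().
        loss_, self.grad_values = self.f([x.reshape(self.shp)])
        return loss_.astype(np.float64)

    def grads(self, x): return self.grad_values.flatten().astype(np.float64)

calls = {"n": 0}
target = np.array([[1.0, -2.0], [3.0, 0.5]])

def toy_fn(inputs):
    """Stand-in for the Keras function: returns [loss, gradient] together."""
    calls["n"] += 1
    diff = inputs[0] - target
    return [np.sum(diff ** 2), 2.0 * diff]

ev = Evaluator(toy_fn, target.shape)
x_min, min_val, info = fmin_l_bfgs_b(ev.loss, np.zeros(target.size),
                                     fprime=ev.grads)

print(x_min.reshape(target.shape))  # should be close to target
print(calls["n"])                   # number of combined loss/gradient evaluations
```

The design relies on the optimizer asking for the gradient right after the loss at the same point, so grads() can return the cached value instead of recomputing it.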
In [13]:
loss = metrics.mse(layer, targ)
grads = K.gradients(loss, model.input)
fn = K.function([model.input], [loss]+grads)
evaluator = Evaluator(fn, shp)
In [14]:
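def solve_image(eval_obj, niter, x):
    for i in range(niter):
        # Run a short L-BFGS burst (at most 20 function evaluations) from the current image
        x, min_val, info = fmin_l_bfgs_b(eval_obj.loss, x.flatten(),
                                         fprime=eval_obj.grads, maxfun=20)
        # Keep values within the mean-centred pixel range
        x = np.clip(x, -127,127)
        print('Current loss value:', min_val)
        # Save the intermediate image; the results/ directory must already exist
        imsave(f'{path}/results/res_at_iteration_{i}.png', deproc(x.copy(), shp)[0])
    return x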
In [15]:
rand_img = lambda shape: np.random.uniform(-2.5, 2.5, shape)/100
x = rand_img(shp)
plt.imshow(x[0]);
In [16]:
iterations=10
In [18]:
x = solve_image(evaluator, iterations, x)
Current loss value: 48.2291946411
Current loss value: 11.814458847
Current loss value: 6.15314722061
Current loss value: 4.10418081284
Current loss value: 3.15371417999
Current loss value: 2.62931346893
Current loss value: 2.59162425995
Current loss value: 2.59218120575
Current loss value: 2.59218072891
Current loss value: 2.59218072891
In [19]:
Image.open(path + '/results/res_at_iteration_9.png')
Out[19]: