[www.joyofdata.de](http://blog.joyofdata.de) - [@joyofdata](https://twitter.com/joyofdata) - [github.com/joyofdata](https://github.com/joyofdata)

You will find more information here: [Neural Networks with Caffe on the GPU](http://www.joyofdata.de/blog/neural-networks-with-caffe-on-the-gpu)


Training a Multi-Layer Neural Network with Caffe

In [2]:
import sys
import subprocess
import platform

sys.path.append("/home/ubuntu/caffe/python/")
import caffe
caffe.set_mode_gpu()
import lmdb

from sklearn.cross_validation import StratifiedShuffleSplit
import pandas as pd
import numpy as np

import matplotlib.pyplot as plt
%matplotlib inline

Status Quo

In [3]:
print "OS:     ", platform.platform()
print "Python: ", sys.version.split("\n")[0]
print "CUDA:   ", subprocess.Popen(["nvcc","--version"], stdout=subprocess.PIPE).communicate()[0].split("\n")[3]
print "LMDB:   ", ".".join([str(i) for i in lmdb.version()])
OS:      Linux-3.13.0-49-generic-x86_64-with-Ubuntu-14.04-trusty
Python:  2.7.6 (default, Mar 22 2014, 22:59:56) 
CUDA:    Cuda compilation tools, release 7.0, V7.0.27
LMDB:    0.9.14

Load Data from CSV and Transform

The CSV is assumed to be the training data from the "Otto Group Product Classification Challenge" at Kaggle. It contains 95 columns:

  • [0] id (discarded)
  • [1..93] features (integer values)
  • [94] label (9 categories - Class_1,..,Class_9)
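
As a quick sanity check, the expected layout can be verified before transforming anything. This is a minimal sketch, assuming the standard column names of the Kaggle file (id, feat_1,..,feat_93, target):

df_check = pd.read_csv("train.csv", sep=",")
# 1 id column + 93 feature columns + 1 label column
assert df_check.shape[1] == 95
assert df_check.columns[0] == "id" and df_check.columns[-1] == "target"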
In [4]:
df = pd.read_csv("train.csv", sep=",")
features = df.ix[:,1:-1].as_matrix()  # columns 1..93 (the features)
labels = df.ix[:,-1].as_matrix()      # last column (the class label)
In [5]:
# log-transform the features and map "Class_1",..,"Class_9" to 0,..,8
vec_log = np.vectorize(lambda x: np.log(x + 1))
vec_int = np.vectorize(lambda label: int(label[-1]) - 1)
In [6]:
features = vec_log(features)
labels = vec_int(labels)
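
To see what the two transformations do, here is a toy check (a minimal sketch; note that log(1+1) ≈ 0.693 is exactly the value showing up in the LMDB dump further below):

print vec_log(np.array([0, 1, 9]))    # -> [ 0.          0.69314718  2.30258509]
print vec_int(np.array(["Class_3"]))  # -> [2]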

Stratified Split for Training and Testing

In [7]:
# one stratified split: 98% of the cases for training, 2% for testing
sss = StratifiedShuffleSplit(labels, 1, test_size=0.02, random_state=0)
sss = list(sss)[0]
In [8]:
features_training = features[sss[0],]
labels_training = labels[sss[0],]

features_testing = features[sss[1],]
labels_testing = labels[sss[1],]
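
Because the split is stratified, the class proportions should be (nearly) identical in both parts. A quick check, using the integer labels produced above:

for name, l in [("training", labels_training), ("testing", labels_testing)]:
    print name, np.bincount(l) / float(len(l))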

Load Data into LMDB

In [9]:
# http://deepdish.io/2015/04/28/creating-lmdb-in-python/
def load_data_into_lmdb(lmdb_name, features, labels=None):
    # map_size caps the size of the database; twice the raw feature
    # size leaves room for the protobuf overhead
    env = lmdb.open(lmdb_name, map_size=features.nbytes*2)
    
    # reshape to (N, C, H, W) as Caffe expects it: one 1x1 "image"
    # with 93 channels per case
    features = features[:,:,None,None]
    for i in range(features.shape[0]):
        datum = caffe.proto.caffe_pb2.Datum()
        
        datum.channels = features.shape[1]
        datum.height = 1
        datum.width = 1
        
        # integer features go into the byte field, floats into float_data
        if features.dtype == np.int:
            datum.data = features[i].tostring()
        elif features.dtype == np.float: 
            datum.float_data.extend(features[i].flat)
        else:
            raise Exception("features.dtype unknown.")
        
        if labels is not None:
            datum.label = int(labels[i])
        
        # zero-padded keys keep the cases in their original order
        str_id = '{:08}'.format(i)
        with env.begin(write=True) as txn:
            txn.put(str_id, datum.SerializeToString())
In [10]:
load_data_into_lmdb("/home/ubuntu/data/train_data_lmdb", features_training, labels_training)
load_data_into_lmdb("/home/ubuntu/data/test_data_lmdb", features_testing, labels_testing)

Check Content of LMDB

In [11]:
# http://research.beenfrog.com/code/2015/03/28/read-leveldb-lmdb-for-caffe-with-python.html
def get_data_for_case_from_lmdb(lmdb_name, key):
    lmdb_env = lmdb.open(lmdb_name, readonly=True)
    lmdb_txn = lmdb_env.begin()

    # fetch the serialized Datum stored under the given key
    raw_datum = lmdb_txn.get(key)
    datum = caffe.proto.caffe_pb2.Datum()
    datum.ParseFromString(raw_datum)

    feature = caffe.io.datum_to_array(datum)
    label = datum.label

    return (label, feature)
In [12]:
get_data_for_case_from_lmdb("/home/ubuntu/data/train_data_lmdb/", "00012345")
Out[12]:
(2, array([[[ 0.        ]],
 
        [[ 0.        ]],
 
        [[ 0.        ]],
 
        [[ 0.        ]],
 
        [[ 0.        ]],
 
        [[ 0.69314718]],
 
        [[ 0.69314718]],
 
        [ ... ],
 
        [[ 0.        ]],
 
        [[ 0.69314718]],
 
        [[ 0.        ]]]))
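
Instead of fetching single keys, the whole database can also be traversed with a cursor. A minimal sketch using the same Datum parsing as above (the helper name iterate_lmdb is made up):

def iterate_lmdb(lmdb_name):
    env = lmdb.open(lmdb_name, readonly=True)
    with env.begin() as txn:
        # the cursor yields (key, value) pairs in key order
        for key, raw_datum in txn.cursor():
            datum = caffe.proto.caffe_pb2.Datum()
            datum.ParseFromString(raw_datum)
            yield key, datum.label, caffe.io.datum_to_array(datum)

# usage: for key, label, f in iterate_lmdb("/home/ubuntu/data/train_data_lmdb"): ...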

Training the Model

In [13]:
proc = subprocess.Popen(
    ["/home/ubuntu/caffe/build/tools/caffe","train","--solver=config.prototxt"], 
    stderr=subprocess.PIPE)
res = proc.communicate()[1]

# http://nbviewer.ipython.org/github/BVLC/caffe/blob/master/examples/hdf5_classification.ipynb
# or
# caffe.set_mode_gpu()
# solver = caffe.get_solver("config.prototxt")
# solver.solve()
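
config.prototxt itself is not listed here, but judging from the solver parameters echoed in the log printed below, it presumably looks roughly like this (snapshot settings omitted):

net: "model_train_test.prototxt"
test_iter: 100
test_interval: 10000
base_lr: 0.01
lr_policy: "inv"
gamma: 0.0001
power: 0.75
momentum: 0.9
weight_decay: 0.0005
display: 10000
max_iter: 100000
solver_mode: GPU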
In [14]:
print res
libdc1394 error: Failed to initialize libdc1394
I0508 12:19:01.337918  2310 caffe.cpp:113] Use GPU with device ID 0
I0508 12:19:01.502816  2310 caffe.cpp:121] Starting Optimization
I0508 12:19:01.502984  2310 solver.cpp:32] Initializing solver from parameters: 
test_iter: 100
test_interval: 10000
base_lr: 0.01
display: 10000
max_iter: 100000
lr_policy: "inv"
gamma: 0.0001
power: 0.75
momentum: 0.9
weight_decay: 0.0005
solver_mode: GPU
net: "model_train_test.prototxt"
I0508 12:19:01.503021  2310 solver.cpp:70] Creating training net from net file: model_train_test.prototxt
I0508 12:19:01.503921  2310 net.cpp:257] The NetState phase (0) differed from the phase (1) specified by a rule in layer simple
I0508 12:19:01.503988  2310 net.cpp:257] The NetState phase (0) differed from the phase (1) specified by a rule in layer accuracy
I0508 12:19:01.504057  2310 net.cpp:42] Initializing net from parameters: 
name: "otto"
state {
  phase: TRAIN
}
layer {
  name: "otto"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TRAIN
  }
  data_param {
    source: "train_data_lmdb"
    batch_size: 64
    backend: LMDB
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "data"
  top: "ip1"
  inner_product_param {
    num_output: 30
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  inner_product_param {
    num_output: 9
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
I0508 12:19:01.504134  2310 layer_factory.hpp:74] Creating layer otto
I0508 12:19:01.505414  2310 net.cpp:84] Creating Layer otto
I0508 12:19:01.505437  2310 net.cpp:338] otto -> data
I0508 12:19:01.505470  2310 net.cpp:338] otto -> label
I0508 12:19:01.505489  2310 net.cpp:113] Setting up otto
I0508 12:19:01.505575  2310 db.cpp:34] Opened lmdb train_data_lmdb
I0508 12:19:01.505663  2310 data_layer.cpp:67] output data size: 64,93,1,1
I0508 12:19:01.505764  2310 net.cpp:120] Top shape: 64 93 1 1 (5952)
I0508 12:19:01.505779  2310 net.cpp:120] Top shape: 64 (64)
I0508 12:19:01.505791  2310 layer_factory.hpp:74] Creating layer ip1
I0508 12:19:01.505807  2310 net.cpp:84] Creating Layer ip1
I0508 12:19:01.505818  2310 net.cpp:380] ip1 <- data
I0508 12:19:01.505838  2310 net.cpp:338] ip1 -> ip1
I0508 12:19:01.505857  2310 net.cpp:113] Setting up ip1
I0508 12:19:01.506310  2310 net.cpp:120] Top shape: 64 30 (1920)
I0508 12:19:01.506335  2310 layer_factory.hpp:74] Creating layer relu1
I0508 12:19:01.506350  2310 net.cpp:84] Creating Layer relu1
I0508 12:19:01.506360  2310 net.cpp:380] relu1 <- ip1
I0508 12:19:01.506367  2310 net.cpp:327] relu1 -> ip1 (in-place)
I0508 12:19:01.506378  2310 net.cpp:113] Setting up relu1
I0508 12:19:04.415382  2310 net.cpp:120] Top shape: 64 30 (1920)
I0508 12:19:04.415416  2310 layer_factory.hpp:74] Creating layer ip2
I0508 12:19:04.415432  2310 net.cpp:84] Creating Layer ip2
I0508 12:19:04.415446  2310 net.cpp:380] ip2 <- ip1
I0508 12:19:04.415457  2310 net.cpp:338] ip2 -> ip2
I0508 12:19:04.415477  2310 net.cpp:113] Setting up ip2
I0508 12:19:04.415506  2310 net.cpp:120] Top shape: 64 9 (576)
I0508 12:19:04.415524  2310 layer_factory.hpp:74] Creating layer loss
I0508 12:19:04.416088  2310 net.cpp:84] Creating Layer loss
I0508 12:19:04.416102  2310 net.cpp:380] loss <- ip2
I0508 12:19:04.416110  2310 net.cpp:380] loss <- label
I0508 12:19:04.416121  2310 net.cpp:338] loss -> loss
I0508 12:19:04.416137  2310 net.cpp:113] Setting up loss
I0508 12:19:04.416153  2310 layer_factory.hpp:74] Creating layer loss
I0508 12:19:04.416234  2310 net.cpp:120] Top shape: (1)
I0508 12:19:04.416247  2310 net.cpp:122]     with loss weight 1
I0508 12:19:04.416277  2310 net.cpp:167] loss needs backward computation.
I0508 12:19:04.416283  2310 net.cpp:167] ip2 needs backward computation.
I0508 12:19:04.416293  2310 net.cpp:167] relu1 needs backward computation.
I0508 12:19:04.416298  2310 net.cpp:167] ip1 needs backward computation.
I0508 12:19:04.416331  2310 net.cpp:169] otto does not need backward computation.
I0508 12:19:04.416337  2310 net.cpp:205] This network produces output loss
I0508 12:19:04.416350  2310 net.cpp:447] Collecting Learning Rate and Weight Decay.
I0508 12:19:04.416360  2310 net.cpp:217] Network initialization done.
I0508 12:19:04.416365  2310 net.cpp:218] Memory required for data: 41732
I0508 12:19:04.416591  2310 solver.cpp:154] Creating test net (#0) specified by net file: model_train_test.prototxt
I0508 12:19:04.416620  2310 net.cpp:257] The NetState phase (1) differed from the phase (0) specified by a rule in layer otto
I0508 12:19:04.416688  2310 net.cpp:42] Initializing net from parameters: 
name: "otto"
state {
  phase: TEST
}
layer {
  name: "simple"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  data_param {
    source: "test_data_lmdb"
    batch_size: 100
    backend: LMDB
  }
}
layer {
  name: "ip1"
  type: "InnerProduct"
  bottom: "data"
  top: "ip1"
  inner_product_param {
    num_output: 30
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "relu1"
  type: "ReLU"
  bottom: "ip1"
  top: "ip1"
}
layer {
  name: "ip2"
  type: "InnerProduct"
  bottom: "ip1"
  top: "ip2"
  inner_product_param {
    num_output: 9
    weight_filler {
      type: "xavier"
    }
    bias_filler {
      type: "constant"
      value: 0
    }
  }
}
layer {
  name: "accuracy"
  type: "Accuracy"
  bottom: "ip2"
  bottom: "label"
  top: "accuracy"
  include {
    phase: TEST
  }
}
layer {
  name: "loss"
  type: "SoftmaxWithLoss"
  bottom: "ip2"
  bottom: "label"
  top: "loss"
}
I0508 12:19:04.416759  2310 layer_factory.hpp:74] Creating layer simple
I0508 12:19:04.416776  2310 net.cpp:84] Creating Layer simple
I0508 12:19:04.416787  2310 net.cpp:338] simple -> data
I0508 12:19:04.416798  2310 net.cpp:338] simple -> label
I0508 12:19:04.416811  2310 net.cpp:113] Setting up simple
I0508 12:19:04.416862  2310 db.cpp:34] Opened lmdb test_data_lmdb
I0508 12:19:04.416890  2310 data_layer.cpp:67] output data size: 100,93,1,1
I0508 12:19:04.416952  2310 net.cpp:120] Top shape: 100 93 1 1 (9300)
I0508 12:19:04.416966  2310 net.cpp:120] Top shape: 100 (100)
I0508 12:19:04.416972  2310 layer_factory.hpp:74] Creating layer label_simple_1_split
I0508 12:19:04.416990  2310 net.cpp:84] Creating Layer label_simple_1_split
I0508 12:19:04.416999  2310 net.cpp:380] label_simple_1_split <- label
I0508 12:19:04.417007  2310 net.cpp:338] label_simple_1_split -> label_simple_1_split_0
I0508 12:19:04.417021  2310 net.cpp:338] label_simple_1_split -> label_simple_1_split_1
I0508 12:19:04.417035  2310 net.cpp:113] Setting up label_simple_1_split
I0508 12:19:04.417049  2310 net.cpp:120] Top shape: 100 (100)
I0508 12:19:04.417060  2310 net.cpp:120] Top shape: 100 (100)
I0508 12:19:04.417067  2310 layer_factory.hpp:74] Creating layer ip1
I0508 12:19:04.417076  2310 net.cpp:84] Creating Layer ip1
I0508 12:19:04.417083  2310 net.cpp:380] ip1 <- data
I0508 12:19:04.417091  2310 net.cpp:338] ip1 -> ip1
I0508 12:19:04.417104  2310 net.cpp:113] Setting up ip1
I0508 12:19:04.417143  2310 net.cpp:120] Top shape: 100 30 (3000)
I0508 12:19:04.417160  2310 layer_factory.hpp:74] Creating layer relu1
I0508 12:19:04.417172  2310 net.cpp:84] Creating Layer relu1
I0508 12:19:04.417179  2310 net.cpp:380] relu1 <- ip1
I0508 12:19:04.417186  2310 net.cpp:327] relu1 -> ip1 (in-place)
I0508 12:19:04.417198  2310 net.cpp:113] Setting up relu1
I0508 12:19:04.417255  2310 net.cpp:120] Top shape: 100 30 (3000)
I0508 12:19:04.417268  2310 layer_factory.hpp:74] Creating layer ip2
I0508 12:19:04.417278  2310 net.cpp:84] Creating Layer ip2
I0508 12:19:04.417286  2310 net.cpp:380] ip2 <- ip1
I0508 12:19:04.417295  2310 net.cpp:338] ip2 -> ip2
I0508 12:19:04.417307  2310 net.cpp:113] Setting up ip2
I0508 12:19:04.417325  2310 net.cpp:120] Top shape: 100 9 (900)
I0508 12:19:04.417338  2310 layer_factory.hpp:74] Creating layer ip2_ip2_0_split
I0508 12:19:04.417347  2310 net.cpp:84] Creating Layer ip2_ip2_0_split
I0508 12:19:04.417369  2310 net.cpp:380] ip2_ip2_0_split <- ip2
I0508 12:19:04.417377  2310 net.cpp:338] ip2_ip2_0_split -> ip2_ip2_0_split_0
I0508 12:19:04.417392  2310 net.cpp:338] ip2_ip2_0_split -> ip2_ip2_0_split_1
I0508 12:19:04.417400  2310 net.cpp:113] Setting up ip2_ip2_0_split
I0508 12:19:04.417410  2310 net.cpp:120] Top shape: 100 9 (900)
I0508 12:19:04.417418  2310 net.cpp:120] Top shape: 100 9 (900)
I0508 12:19:04.417423  2310 layer_factory.hpp:74] Creating layer accuracy
I0508 12:19:04.417459  2310 net.cpp:84] Creating Layer accuracy
I0508 12:19:04.417474  2310 net.cpp:380] accuracy <- ip2_ip2_0_split_0
I0508 12:19:04.417480  2310 net.cpp:380] accuracy <- label_simple_1_split_0
I0508 12:19:04.417489  2310 net.cpp:338] accuracy -> accuracy
I0508 12:19:04.417498  2310 net.cpp:113] Setting up accuracy
I0508 12:19:04.417511  2310 net.cpp:120] Top shape: (1)
I0508 12:19:04.417521  2310 layer_factory.hpp:74] Creating layer loss
I0508 12:19:04.417528  2310 net.cpp:84] Creating Layer loss
I0508 12:19:04.417537  2310 net.cpp:380] loss <- ip2_ip2_0_split_1
I0508 12:19:04.417543  2310 net.cpp:380] loss <- label_simple_1_split_1
I0508 12:19:04.417556  2310 net.cpp:338] loss -> loss
I0508 12:19:04.417563  2310 net.cpp:113] Setting up loss
I0508 12:19:04.417574  2310 layer_factory.hpp:74] Creating layer loss
I0508 12:19:04.417726  2310 net.cpp:120] Top shape: (1)
I0508 12:19:04.417742  2310 net.cpp:122]     with loss weight 1
I0508 12:19:04.417750  2310 net.cpp:167] loss needs backward computation.
I0508 12:19:04.417757  2310 net.cpp:169] accuracy does not need backward computation.
I0508 12:19:04.417764  2310 net.cpp:167] ip2_ip2_0_split needs backward computation.
I0508 12:19:04.417768  2310 net.cpp:167] ip2 needs backward computation.
I0508 12:19:04.417773  2310 net.cpp:167] relu1 needs backward computation.
I0508 12:19:04.417779  2310 net.cpp:167] ip1 needs backward computation.
I0508 12:19:04.417785  2310 net.cpp:169] label_simple_1_split does not need backward computation.
I0508 12:19:04.417790  2310 net.cpp:169] simple does not need backward computation.
I0508 12:19:04.417795  2310 net.cpp:205] This network produces output accuracy
I0508 12:19:04.417800  2310 net.cpp:205] This network produces output loss
I0508 12:19:04.417815  2310 net.cpp:447] Collecting Learning Rate and Weight Decay.
I0508 12:19:04.417824  2310 net.cpp:217] Network initialization done.
I0508 12:19:04.417829  2310 net.cpp:218] Memory required for data: 73208
I0508 12:19:04.417873  2310 solver.cpp:42] Solver scaffolding done.
I0508 12:19:04.417899  2310 solver.cpp:222] Solving otto
I0508 12:19:04.417911  2310 solver.cpp:223] Learning Rate Policy: inv
I0508 12:19:04.417919  2310 solver.cpp:266] Iteration 0, Testing net (#0)
I0508 12:19:04.460979  2310 solver.cpp:315]     Test net output #0: accuracy = 0.1831
I0508 12:19:04.461011  2310 solver.cpp:315]     Test net output #1: loss = 2.25925 (* 1 = 2.25925 loss)
I0508 12:19:04.461773  2310 solver.cpp:189] Iteration 0, loss = 2.14817
I0508 12:19:04.461799  2310 solver.cpp:204]     Train net output #0: loss = 2.14817 (* 1 = 2.14817 loss)
I0508 12:19:04.461817  2310 solver.cpp:464] Iteration 0, lr = 0.01
I0508 12:19:09.527751  2310 solver.cpp:266] Iteration 10000, Testing net (#0)
I0508 12:19:09.571025  2310 solver.cpp:315]     Test net output #0: accuracy = 0.7784
I0508 12:19:09.571063  2310 solver.cpp:315]     Test net output #1: loss = 0.558455 (* 1 = 0.558455 loss)
I0508 12:19:09.571552  2310 solver.cpp:189] Iteration 10000, loss = 0.577026
I0508 12:19:09.571574  2310 solver.cpp:204]     Train net output #0: loss = 0.577026 (* 1 = 0.577026 loss)
I0508 12:19:09.571591  2310 solver.cpp:464] Iteration 10000, lr = 0.00594604
I0508 12:19:14.722017  2310 solver.cpp:266] Iteration 20000, Testing net (#0)
I0508 12:19:14.765653  2310 solver.cpp:315]     Test net output #0: accuracy = 0.7919
I0508 12:19:14.765692  2310 solver.cpp:315]     Test net output #1: loss = 0.537067 (* 1 = 0.537067 loss)
I0508 12:19:14.766103  2310 solver.cpp:189] Iteration 20000, loss = 0.632795
I0508 12:19:14.766165  2310 solver.cpp:204]     Train net output #0: loss = 0.632795 (* 1 = 0.632795 loss)
I0508 12:19:14.766178  2310 solver.cpp:464] Iteration 20000, lr = 0.00438691
I0508 12:19:19.922034  2310 solver.cpp:266] Iteration 30000, Testing net (#0)
I0508 12:19:19.964999  2310 solver.cpp:315]     Test net output #0: accuracy = 0.7859
I0508 12:19:19.965039  2310 solver.cpp:315]     Test net output #1: loss = 0.535475 (* 1 = 0.535475 loss)
I0508 12:19:19.965441  2310 solver.cpp:189] Iteration 30000, loss = 0.697927
I0508 12:19:19.965468  2310 solver.cpp:204]     Train net output #0: loss = 0.697927 (* 1 = 0.697927 loss)
I0508 12:19:19.965479  2310 solver.cpp:464] Iteration 30000, lr = 0.00353553
I0508 12:19:25.167050  2310 solver.cpp:266] Iteration 40000, Testing net (#0)
I0508 12:19:25.210865  2310 solver.cpp:315]     Test net output #0: accuracy = 0.7933
I0508 12:19:25.210904  2310 solver.cpp:315]     Test net output #1: loss = 0.534498 (* 1 = 0.534498 loss)
I0508 12:19:25.211313  2310 solver.cpp:189] Iteration 40000, loss = 0.352082
I0508 12:19:25.211338  2310 solver.cpp:204]     Train net output #0: loss = 0.352082 (* 1 = 0.352082 loss)
I0508 12:19:25.211354  2310 solver.cpp:464] Iteration 40000, lr = 0.0029907
I0508 12:19:30.322751  2310 solver.cpp:266] Iteration 50000, Testing net (#0)
I0508 12:19:30.367033  2310 solver.cpp:315]     Test net output #0: accuracy = 0.7888
I0508 12:19:30.367086  2310 solver.cpp:315]     Test net output #1: loss = 0.530166 (* 1 = 0.530166 loss)
I0508 12:19:30.367511  2310 solver.cpp:189] Iteration 50000, loss = 0.586298
I0508 12:19:30.367533  2310 solver.cpp:204]     Train net output #0: loss = 0.586298 (* 1 = 0.586298 loss)
I0508 12:19:30.367544  2310 solver.cpp:464] Iteration 50000, lr = 0.00260847
I0508 12:19:35.499639  2310 solver.cpp:266] Iteration 60000, Testing net (#0)
I0508 12:19:35.544010  2310 solver.cpp:315]     Test net output #0: accuracy = 0.7936
I0508 12:19:35.544049  2310 solver.cpp:315]     Test net output #1: loss = 0.532958 (* 1 = 0.532958 loss)
I0508 12:19:35.544478  2310 solver.cpp:189] Iteration 60000, loss = 0.490088
I0508 12:19:35.544502  2310 solver.cpp:204]     Train net output #0: loss = 0.490088 (* 1 = 0.490088 loss)
I0508 12:19:35.544514  2310 solver.cpp:464] Iteration 60000, lr = 0.00232368
I0508 12:19:40.707180  2310 solver.cpp:266] Iteration 70000, Testing net (#0)
I0508 12:19:40.749495  2310 solver.cpp:315]     Test net output #0: accuracy = 0.7839
I0508 12:19:40.749534  2310 solver.cpp:315]     Test net output #1: loss = 0.530616 (* 1 = 0.530616 loss)
I0508 12:19:40.749927  2310 solver.cpp:189] Iteration 70000, loss = 0.350372
I0508 12:19:40.749953  2310 solver.cpp:204]     Train net output #0: loss = 0.350372 (* 1 = 0.350372 loss)
I0508 12:19:40.749966  2310 solver.cpp:464] Iteration 70000, lr = 0.00210224
I0508 12:19:45.909342  2310 solver.cpp:266] Iteration 80000, Testing net (#0)
I0508 12:19:45.953330  2310 solver.cpp:315]     Test net output #0: accuracy = 0.7882
I0508 12:19:45.953367  2310 solver.cpp:315]     Test net output #1: loss = 0.529338 (* 1 = 0.529338 loss)
I0508 12:19:45.953785  2310 solver.cpp:189] Iteration 80000, loss = 0.436605
I0508 12:19:45.953809  2310 solver.cpp:204]     Train net output #0: loss = 0.436605 (* 1 = 0.436605 loss)
I0508 12:19:45.953819  2310 solver.cpp:464] Iteration 80000, lr = 0.0019245
I0508 12:19:51.088989  2310 solver.cpp:266] Iteration 90000, Testing net (#0)
I0508 12:19:51.134217  2310 solver.cpp:315]     Test net output #0: accuracy = 0.787
I0508 12:19:51.134254  2310 solver.cpp:315]     Test net output #1: loss = 0.525386 (* 1 = 0.525386 loss)
I0508 12:19:51.134697  2310 solver.cpp:189] Iteration 90000, loss = 0.521645
I0508 12:19:51.134722  2310 solver.cpp:204]     Train net output #0: loss = 0.521645 (* 1 = 0.521645 loss)
I0508 12:19:51.134732  2310 solver.cpp:464] Iteration 90000, lr = 0.00177828
I0508 12:19:56.294564  2310 solver.cpp:334] Snapshotting to _iter_100001.caffemodel
I0508 12:19:56.294857  2310 solver.cpp:342] Snapshotting solver state to _iter_100001.solverstate
I0508 12:19:56.295253  2310 solver.cpp:248] Iteration 100000, loss = 0.353043
I0508 12:19:56.295280  2310 solver.cpp:266] Iteration 100000, Testing net (#0)
I0508 12:19:56.338987  2310 solver.cpp:315]     Test net output #0: accuracy = 0.7866
I0508 12:19:56.339017  2310 solver.cpp:315]     Test net output #1: loss = 0.529087 (* 1 = 0.529087 loss)
I0508 12:19:56.339030  2310 solver.cpp:253] Optimization Done.
I0508 12:19:56.339035  2310 caffe.cpp:134] Optimization Done.
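
The test accuracy over the iterations can be scraped from the captured log with a regular expression (a sketch, assuming the log format shown above):

import re
pairs = re.findall(r"Iteration (\d+), Testing net.*?accuracy = ([\d.]+)", res, re.S)
iters, accs = zip(*[(int(i), float(a)) for i, a in pairs])
plt.plot(iters, accs, marker="o")
plt.xlabel("iteration"); plt.ylabel("test accuracy")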

Applying the Model

In [15]:
net = caffe.Net("model_prod.prototxt","./_iter_100001.caffemodel", caffe.TEST)
In [16]:
l, f = get_data_for_case_from_lmdb("/home/ubuntu/data/test_data_lmdb/", "00001230")
out = net.forward(**{net.inputs[0]: np.asarray([f])})

# if the index of the largest element matches the integer
# label we stored for that case - then the prediction is right
print np.argmax(out["prob"][0]) == l, "\n", out
plt.bar(range(9),out["prob"][0])
True 
{'prob': array([[ 0.00519855,  0.83611858,  0.05120391,  0.07174591,  0.0020012 ,
         0.00854634,  0.0165364 ,  0.00283596,  0.00581307]], dtype=float32)}
Out[16]:
<Container object of 9 artists>

Visualizing the Network Graph

In [172]:
from google.protobuf import text_format
from caffe.draw import get_pydot_graph
from caffe.proto import caffe_pb2
from IPython.display import display, Image 

_net = caffe_pb2.NetParameter()
f = open("model_prod.prototxt")
text_format.Merge(f.read(), _net)
display(Image(get_pydot_graph(_net,"TB").create_png()))

Visualizing the Weights

In [192]:
# weights of ip1, connecting the 93 inputs with the 30 hidden units
arr = net.params["ip1"][0].data  # shape: (30, 93)
In [222]:
fig = plt.figure(figsize=(10,10))
ax = fig.add_subplot(111)
cax = ax.matshow(arr, interpolation='none')
fig.colorbar(cax, orientation="horizontal")
In [230]:
# histograms of the ip1 weights - one dataset per hidden unit (row of arr)
_ = plt.hist(arr.tolist(), bins=20)