Exercise 4.1¶

Import Libraries¶

Python requires you to import the libraries and functions you need for specific tasks, such as scientific computing (scipy), linear algebra (numpy), and graphics (matplotlib). These libraries can be installed with the pip command-line tool. Alternatively, you can install a Python distribution such as Anaconda or Canopy, which comes with these and many other standard packages pre-installed.
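
Before running the import cell, it can help to check which packages are actually available. The sketch below is one possible way to do that with the standard library only; the helper name missing_packages is made up for this illustration. Note that scikit-image is imported as skimage and scikit-learn as sklearn.

```python
import importlib.util

# Top-level import names used in this exercise
# (scikit-image imports as "skimage", scikit-learn as "sklearn").
required = ["numpy", "scipy", "matplotlib", "skimage", "sklearn", "ipywidgets"]

def missing_packages(names):
    """Return the names that cannot be found by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable")
```

Any name reported as missing can then be installed with pip or through the distribution's package manager.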

In [1]:

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import matplotlib.pyplot as plt # plotting
from skimage.io import imread # read in images
from skimage.segmentation import mark_boundaries # mark labels
from sklearn.metrics import roc_curve, auc # roc curve tools
from skimage.color import label2rgb
import numpy as np # linear algebra / matrices
# make the notebook interactive
from ipywidgets import interact, interactive, fixed 
import ipywidgets as widgets #add new widgets
from IPython.display import display

import os

In [2]:

base_path = '04-files'
seg_path = os.path.join(base_path,'DogVsMuffin_seg_bw.jpg')
rgb_path = os.path.join(base_path,'DogVsMuffin.jpg')
face_path = os.path.join(base_path,'DogVsMuffin_face.jpg')
seg_img = imread(seg_path)[80:520,:450]
rgb_img = imread(rgb_path)[80:520,:450,:]
face_img = imread(face_path)
print('RGB Size',rgb_img.shape,'Seg Size',seg_img.shape,'Face Size',face_img.shape)

RGB Size (440, 450, 3) Seg Size (440, 450) Face Size (111, 131, 3)

In [3]:

%matplotlib inline
fig, (ax1, ax2, ax3) = plt.subplots(1,3,figsize = (20,5))
ax1.imshow(rgb_img) # show the color image
ax1.set_title("Color Image")
ax2.imshow(seg_img, cmap='gray') # show the segments
ax2.set_title("Ground Truth")
ax3.imshow(mark_boundaries(rgb_img,seg_img))
ax3.set_title("Labeled Image")

Out[3]:

Text(0.5,1,'Labeled Image')

Creating a Simple ROC Curve¶

We use a score function based on the mean of the red, green, and blue channels: $$ I = \frac{R+G+B}{3} $$ We then normalize by the maximum value (255, since the image is 8-bit) and invert the result, so that darker pixels receive higher scores: $$ s = 1 - \frac{I}{255} $$
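
To make the score concrete, here is a minimal sketch on a made-up 1x3 image (the array toy_rgb is an assumption for illustration, not part of the exercise data). It follows the same steps as the code cell for this section, which inverts the normalized intensity.

```python
import numpy as np

# A hypothetical 1x3 RGB image: a black, a mid-gray, and a white pixel (8-bit).
toy_rgb = np.array([[[0, 0, 0], [128, 128, 128], [255, 255, 255]]], dtype=np.uint8)

# Mean over the channel axis gives the intensity I = (R + G + B) / 3.
intensity = np.mean(toy_rgb.astype(np.float32), axis=2)

# Inverted, normalized score: dark pixels score close to 1, bright ones close to 0.
toy_score = 1 - intensity / 255.0
print(toy_score)
```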

In [4]:

ground_truth_labels = seg_img.flatten()>0
score_value = 1-np.mean(rgb_img.astype(np.float32),2).flatten()/255.0
fpr, tpr, _ = roc_curve(ground_truth_labels,score_value)
roc_auc = auc(fpr,tpr)

In [5]:

%matplotlib inline
fig, ax = plt.subplots(1,1)
ax.plot(fpr, tpr, label='ROC curve (area = %0.2f)' % roc_auc)
ax.plot([0, 1], [0, 1], 'k--')
ax.set_xlim([0.0, 1.0])
ax.set_ylim([0.0, 1.05])
ax.set_xlabel('False Positive Rate')
ax.set_ylabel('True Positive Rate')
ax.set_title('Receiver operating characteristic example')
ax.legend(loc="lower right")

Out[5]:

<matplotlib.legend.Legend at 0x11e506e10>
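
To see what the curve and its area represent, the following sketch builds an ROC curve by hand for four made-up samples. It is a simplified illustration that assumes all scores are distinct; the function roc_points and the toy arrays are hypothetical names, and this is not how roc_curve is implemented internally.

```python
import numpy as np

def roc_points(labels, scores):
    """Sweep a threshold from high to low and record (fpr, tpr) at each step.

    Simplification: assumes every score value is distinct.
    """
    order = np.argsort(-scores)            # highest score first
    sorted_labels = labels[order]
    tps = np.cumsum(sorted_labels)         # true positives accepted so far
    fps = np.cumsum(~sorted_labels)        # false positives accepted so far
    tpr = np.concatenate([[0.0], tps / labels.sum()])
    fpr = np.concatenate([[0.0], fps / (~labels).sum()])
    return fpr, tpr

toy_labels = np.array([False, False, True, True])
toy_scores = np.array([0.1, 0.4, 0.35, 0.8])
toy_fpr, toy_tpr = roc_points(toy_labels, toy_scores)

# Area under the curve by the trapezoid rule.
toy_auc = np.sum(np.diff(toy_fpr) * (toy_tpr[1:] + toy_tpr[:-1]) / 2)
print(toy_auc)  # 0.75
```

An area of 0.5 corresponds to the dashed diagonal (random guessing), and 1.0 to a score that separates the two classes perfectly.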

Adding Filters¶

We can add a filter to this process by importing uniform_filter and applying it to the gray-value image before computing the score.
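
A uniform filter replaces each pixel with the mean of a square window around it. The sketch below shows the idea in plain numpy on a made-up 5x5 image; box_filter is a hypothetical helper, it assumes an odd window size, and the mirrored border handling is a choice made for this sketch rather than a statement about the library's internals.

```python
import numpy as np

def box_filter(image, size):
    """Mean over a size x size window around each pixel (odd size assumed)."""
    pad = size // 2
    # Mirror the image at its borders so every pixel has a full window.
    padded = np.pad(image.astype(np.float64), pad, mode="symmetric")
    out = np.zeros(image.shape, dtype=np.float64)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out / (size * size)

toy = np.zeros((5, 5))
toy[2, 2] = 9.0                 # a single bright pixel
smoothed = box_filter(toy, 3)
print(smoothed)                 # the bright pixel is spread over a 3x3 block of ones
```

Larger windows suppress more pixel-level noise but also blur the boundaries between regions, which is the trade-off to keep in mind when choosing filter_size.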

In [6]:

%matplotlib inline
from scipy.ndimage import uniform_filter # scipy.ndimage.filters is a deprecated namespace
filter_size = 45
filtered_image = uniform_filter(np.mean(rgb_img,2),filter_size)
score_value = 1-filtered_image.astype(np.float32).flatten()/255.0
fpr2, tpr2, _ = roc_curve(ground_truth_labels,score_value)
roc_auc2 = auc(fpr2,tpr2)

fig, ax = plt.subplots(1,1)
ax.plot(fpr, tpr, label='Raw ROC curve (area = %0.2f)' % roc_auc)
ax.plot(fpr2, tpr2, label='Filtered ROC curve (area = %0.2f)' % roc_auc2)
ax.plot([0, 1], [0, 1], 'k--')
ax.set_xlim([0.0, 1.0])
ax.set_ylim([0.0, 1.05])
ax.set_xlabel('False Positive Rate')
ax.set_ylabel('True Positive Rate')
ax.set_title('Receiver operating characteristic example')
ax.legend(loc="lower right")

Out[6]:

<matplotlib.legend.Legend at 0x11f5addd8>

Tasks¶

How can you improve filtering in this analysis?

Which filter elements might improve the area under the ROC?
Try making workflows to test out a few different filters

Where might morphological operations fit in?

How can you make them part of this workflow as well?

(Challenge) Try to use the optimize toolbox of scipy with the fmin function (from scipy.optimize import fmin) to find the optimum parameters for the highest area (hint: fmin finds the minimum value)
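
The hint can be illustrated on a toy problem before applying it to the AUC. In this sketch, toy_quality is a made-up function with a known maximum; since fmin minimizes, we pass it one minus the quality.

```python
from scipy.optimize import fmin

def toy_quality(params):
    """Hypothetical quality measure that peaks at 1.0 when params == [2, -1]."""
    x, y = params
    return 1.0 - (x - 2.0) ** 2 - (y + 1.0) ** 2

# fmin minimizes, so hand it (1 - quality) in order to maximize the quality.
best = fmin(lambda p: 1.0 - toy_quality(p), [1.0, 1.0], disp=False)
print(best, toy_quality(best))
```

The same pattern applies to the exercise: wrap the AUC computation in a function of the parameters and minimize 1 - AUC.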

In [7]:

from scipy.optimize import fmin
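def calc_auc(rv, gv, bv, fsize):
    # NOTE: fsize is accepted (and varied by the optimizer) but never used;
    # the filter size stays fixed at 45, so the outputs below reflect that value
    filter_size = 45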
    gray_image = (rv*rgb_img[:,:,0]+gv*rgb_img[:,:,1]+bv*rgb_img[:,:,2])/(rv+gv+bv)
    filtered_image = uniform_filter(gray_image,filter_size)
    score_value = filtered_image.astype(np.float32).flatten()/255.0
    fpr2, tpr2, _ = roc_curve(ground_truth_labels,score_value)
    return {'fpr':fpr2, 'tpr':tpr2, 'auc':auc(fpr2,tpr2), 'gimg': gray_image, 'fimg': filtered_image}

In [8]:

# test the function to make sure it works
min_func = lambda args: 1-calc_auc(*args)['auc']
min_start = [1,1,1,20]
min_func(min_start)

Out[8]:

0.5706766670578615

In [9]:

opt_res = fmin(min_func,min_start)

Optimization terminated successfully.
         Current function value: 0.334108
         Iterations: 299
         Function evaluations: 606

In [10]:

opt_values = calc_auc(*opt_res)
tprOpt = opt_values['tpr']
fprOpt = opt_values['fpr']
roc_aucOpt = opt_values['auc']

In [11]:

fig, (ax_img,ax) = plt.subplots(1,2, figsize = (20,10))
ax_img.imshow(opt_values['gimg'], cmap = 'gray')
ax_img.set_title('Transformed Color Image')
ax.plot(fpr, tpr, label='Raw ROC curve (area = %0.2f)' % roc_auc)
ax.plot(fprOpt, tprOpt, label='Optimized ROC curve (area = %0.2f)' % roc_aucOpt)
ax.plot([0, 1], [0, 1], 'k--')
ax.set_xlim([0.0, 1.0])
ax.set_ylim([0.0, 1.05])
ax.set_xlabel('False Positive Rate')
ax.set_ylabel('True Positive Rate')
ax.set_title('Receiver operating characteristic example')
ax.legend(loc="lower right")

Out[11]:

<matplotlib.legend.Legend at 0x11f29ceb8>

Non-linear optimization¶

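Here we use a non-linear approach to improve the quality of the results: each color channel is shifted by its own threshold and passed through a ReLU before the weighted sum, so the optimizer can tune both the weights and the thresholds.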
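
The per-channel transform used in the next cell has the form relu(c/255 - m), where m is a threshold. The sketch below shows its effect on a few made-up normalized intensities; shifted_relu is a hypothetical name used only for this illustration.

```python
import numpy as np

def shifted_relu(channel, threshold):
    """Zero out values below the threshold and keep the excess above it."""
    shifted = channel - threshold
    return (shifted + np.abs(shifted)) / 2   # equivalent to max(shifted, 0)

toy_channel = np.array([0.0, 0.25, 0.5, 1.0])   # normalized intensities in [0, 1]
print(shifted_relu(toy_channel, 0.5))            # [0.  0.  0.  0.5]
```

Everything at or below the threshold contributes nothing to the score, which is what makes the combined score a non-linear function of the pixel values.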

In [12]:

def relu(x): 
    return (x+np.abs(x))/2
def calc_auc_nl(rv, rm, gv, gm, bv, bm):
    # rv, gv, bv: channel weights; rm, gm, bm: per-channel thresholds
    # (no spatial filter is applied in this variant)
    gray_image = (rv*relu(rgb_img[:,:,0]/255.0-rm)+gv*relu(rgb_img[:,:,1]/255.0-gm)+
                  bv*relu(rgb_img[:,:,2]/255.0-bm))/(rv+gv+bv)
    score_value = gray_image.astype(np.float32).flatten()
    fpr2, tpr2, _ = roc_curve(ground_truth_labels,score_value)
    return {'fpr': fpr2, 'tpr': tpr2, 'auc':auc(fpr2,tpr2), 'gimg': gray_image}

In [13]:

# test the function to make sure it works
min_func = lambda args: 1-calc_auc_nl(*args)['auc']
min_start = [1,0,1,0,1,0]
min_start[0] = opt_res[0]
min_start[2] = opt_res[1]
min_start[4] = opt_res[2]
min_func(min_start)

Out[13]:

0.38524423892945336

In [14]:

opt_res = fmin(min_func,min_start, maxiter = 100)

Warning: Maximum number of iterations has been exceeded.

In [15]:

opt_values_nl = calc_auc_nl(*opt_res)
tprOpt_nl = opt_values_nl['tpr']
fprOpt_nl = opt_values_nl['fpr']
roc_aucOpt_nl = opt_values_nl['auc']

In [16]:

fig, (ax_img,ax) = plt.subplots(1,2, figsize = (20,10))
ax_img.imshow(opt_values_nl['gimg'], cmap='gray')
ax_img.set_title('Transformed Color Image')
ax.plot(fpr, tpr, label='Raw ROC curve (area = %0.2f)' % roc_auc)
ax.plot(fprOpt, tprOpt, label='Optimized ROC curve (area = %0.2f)' % roc_aucOpt)
ax.plot(fprOpt_nl, tprOpt_nl, label='NL Optimized ROC curve (area = %0.2f)' % roc_aucOpt_nl)
ax.plot([0, 1], [0, 1], 'k--')
ax.set_xlim([0.0, 1.0])
ax.set_ylim([0.0, 1.05])
ax.set_xlabel('False Positive Rate')
ax.set_ylabel('True Positive Rate')
ax.set_title('Receiver operating characteristic example')
ax.legend(loc="lower right")

Out[16]:

<matplotlib.legend.Legend at 0x11f8b27b8>

Next Steps¶

Rather than simply adjusting a handful of basic parameters, we can adjust entire arrays of values. The example below is a minimal convolutional neural network: a single convolution kernel followed by a ReLU activation.
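
As a small illustration of the two building blocks, the sketch below applies a direct (non-FFT) convolution and a ReLU to a made-up 3x3 image. conv2d_same is a hypothetical helper that assumes odd kernel dimensions and zero padding; the notebook cell itself uses fftconvolve instead.

```python
import numpy as np

def conv2d_same(image, kernel):
    """Direct 2D convolution with zero padding; output has the image's shape.

    Assumes odd kernel dimensions for simplicity.
    """
    kh, kw = kernel.shape
    ph, pw = kh // 2, kw // 2
    padded = np.pad(image, ((ph, ph), (pw, pw)), mode="constant")
    flipped = kernel[::-1, ::-1]           # convolution flips the kernel
    out = np.zeros(image.shape, dtype=np.float64)
    for dy in range(kh):
        for dx in range(kw):
            out += flipped[dy, dx] * padded[dy:dy + image.shape[0], dx:dx + image.shape[1]]
    return out

toy_img = np.array([[0.0, 0.0, 0.0],
                    [0.0, 1.0, 0.0],
                    [0.0, 0.0, 0.0]])
edge_kernel = np.array([[0.0, 0.0, 0.0],
                        [-1.0, 0.0, 1.0],
                        [0.0, 0.0, 0.0]])

response = conv2d_same(toy_img, edge_kernel)   # an impulse reproduces the kernel
activated = np.maximum(response, 0)            # ReLU keeps only positive responses
print(activated)
```

In the notebook, every entry of the kernel is a free parameter, so a 10x10 kernel gives the optimizer 100 values to tune instead of three or four.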

In [17]:

#from scipy.ndimage.filters import convolve
from scipy.signal import fftconvolve
convolve = lambda img1,img2: fftconvolve(img1,img2,mode='same')
CONV_SIZE = (10,10,1)
grey_img = np.reshape(np.mean(rgb_img,2)/255.0,(rgb_img.shape[0],rgb_img.shape[1],1))
def calc_auc_conv(rcoefs):
    coefs = rcoefs.reshape(CONV_SIZE)/rcoefs.sum()
    score_image = relu(convolve(grey_img,coefs))
    score_value = score_image.flatten()
    fpr2, tpr2, _ = roc_curve(ground_truth_labels,score_value)
    return {'fpr': fpr2, 'tpr': tpr2, 'auc': auc(fpr2,tpr2), 'gimg': score_image}

In [18]:

# test the function to make sure it works
min_func = lambda rcoefs: 1-calc_auc_conv(rcoefs)['auc']
min_start = np.random.uniform(-1, 1, size = CONV_SIZE)
min_func(min_start)

Out[18]:

0.6067390144774316

In [19]:

%%time
opt_res_conv = fmin(min_func,
                    min_start, 
                    maxiter = 100)

Warning: Maximum number of iterations has been exceeded.
CPU times: user 19 s, sys: 5.68 s, total: 24.7 s
Wall time: 40.7 s

In [20]:

opt_values_conv = calc_auc_conv(opt_res_conv)
tprOpt_conv = opt_values_conv['tpr']
fprOpt_conv = opt_values_conv['fpr']
roc_aucOpt_conv = opt_values_conv['auc']
out_kernel = opt_res_conv.reshape(CONV_SIZE)/opt_res_conv.sum()
fig, ax_all = plt.subplots(1,out_kernel.shape[2])
for i,c_ax in enumerate(np.array(ax_all).flatten()):
    c_ax.imshow(out_kernel[:,:,i])
    c_ax.set_title(str(i))

In [21]:

fig, (ax_img,ax) = plt.subplots(1,2, figsize = (20,10))
ax_img.imshow(opt_values_conv['gimg'].squeeze(), cmap='gray')
ax_img.set_title('Transformed Color Image')
ax.plot(fpr, tpr, label='Raw ROC curve (area = %0.2f)' % roc_auc)
ax.plot(fprOpt, tprOpt, label='Optimized ROC curve (area = %0.2f)' % roc_aucOpt)
ax.plot(fprOpt_conv, tprOpt_conv, label='CNN Optimized ROC curve (area = %0.2f)' % roc_aucOpt_conv)
ax.plot([0, 1], [0, 1], 'k--')
ax.set_xlim([0.0, 1.0])
ax.set_ylim([0.0, 1.05])
ax.set_xlabel('False Positive Rate')
ax.set_ylabel('True Positive Rate')
ax.set_title('Receiver operating characteristic example')
ax.legend(loc="lower right")

Out[21]:

<matplotlib.legend.Legend at 0x121698710>

RGB CNN¶

Here the kernel spans all three color channels (CONV_SIZE = (10, 10, 3)), so the CNN can work on the RGB values instead of the gray value.
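
A three-channel kernel should still produce a single score per pixel. One way to get that from a 'same'-mode 3D convolution, sketched below on random made-up data, is to take the middle slice along the channel axis, where the kernel overlaps all three channels. The arrays toy_rgb and toy_kernel are assumptions for illustration only.

```python
import numpy as np
from scipy.signal import fftconvolve

rng = np.random.default_rng(0)
toy_rgb = rng.uniform(0, 1, size=(12, 14, 3))     # hypothetical small RGB image
toy_kernel = rng.uniform(-1, 1, size=(5, 5, 3))   # one 5x5 filter per channel

# 3D 'same' convolution returns three slices along the channel axis; the middle
# one is the position where the kernel overlaps all three color channels.
full_overlap = fftconvolve(toy_rgb, toy_kernel, mode="same")[:, :, 1]

# Equivalent channel-by-channel sum (convolution flips the kernel's channel order).
per_channel = sum(
    fftconvolve(toy_rgb[:, :, k], toy_kernel[:, :, 2 - k], mode="same")
    for k in range(3)
)

print(full_overlap.shape, np.allclose(full_overlap, per_channel))
```

The result is a 2D score image with one value per pixel, which can be flattened and compared against the ground-truth labels in the same way as the gray-value scores.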

In [22]:

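CONV_SIZE = (10, 10, 3)
rgb_norm = rgb_img/255.0 # color image scaled to [0, 1]
def calc_auc_conv2d(rcoefs):
    coefs = rcoefs.reshape(CONV_SIZE)/rcoefs.sum()
    # convolve the color image (not grey_img); with mode='same' the middle slice
    # along the channel axis is where the kernel overlaps all three channels
    score_image = relu(convolve(rgb_norm,coefs)[:,:,1])
    score_value = score_image.flatten()
    fpr2, tpr2, _ = roc_curve(ground_truth_labels,score_value)
    return {'fpr': fpr2, 'tpr': tpr2, 'auc': auc(fpr2,tpr2), 'gimg': score_image}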

In [23]:

%%time
# test the function to make sure it works
min_func = lambda rcoefs: 1-calc_auc_conv2d(rcoefs)['auc']
min_start = np.random.uniform(-1, 1, size = CONV_SIZE).ravel()
for i in range(10): min_func(min_start)

CPU times: user 1.42 s, sys: 564 ms, total: 1.98 s
Wall time: 3.36 s

In [24]:

%%time
opt_res_conv2d = fmin(min_func, min_start, maxfun = 2, maxiter = 1)
#opt_res_conv2d = min_start

Warning: Maximum number of function evaluations has been exceeded.
CPU times: user 34.8 s, sys: 13.8 s, total: 48.6 s
Wall time: 1min

In [25]:

%matplotlib inline
opt_values_conv = calc_auc_conv2d(opt_res_conv2d)
tprOpt_conv = opt_values_conv['tpr']
fprOpt_conv = opt_values_conv['fpr']
roc_aucOpt_conv = opt_values_conv['auc']
out_kernel = opt_res_conv2d.reshape(CONV_SIZE)/opt_res_conv2d.sum() # normalize by this kernel's own sum
fig, ax_all = plt.subplots(1,out_kernel.shape[2])
for i,c_ax in enumerate(np.array(ax_all).flatten()):
    c_ax.imshow(out_kernel[:,:,i])
    c_ax.set_title(str(i))

In [ ]:

fig, (ax_img,ax) = plt.subplots(1,2, figsize = (20,10))
ax_img.imshow(mark_boundaries(opt_values_conv['gimg'].squeeze(),seg_img), cmap='gray')
ax_img.set_title('Transformed Color Image')
ax.plot(fpr, tpr, label='Raw ROC curve (area = %0.2f)' % roc_auc)
ax.plot(fprOpt, tprOpt, label='Optimized ROC curve (area = %0.2f)' % roc_aucOpt)
ax.plot(fprOpt_conv, tprOpt_conv, label='CNN Optimized ROC curve (area = %0.2f)' % roc_aucOpt_conv)
ax.plot([0, 1], [0, 1], 'k--')
ax.set_xlim([0.0, 1.0])
ax.set_ylim([0.0, 1.05])
ax.set_xlabel('False Positive Rate')
ax.set_ylabel('True Positive Rate')
ax.set_title('Receiver operating characteristic example')
ax.legend(loc="lower right")

Out[ ]:

<matplotlib.legend.Legend at 0x11f98c5c0>
