ESF project of the University of West Bohemia in Pilsen, reg. no. CZ.02.2.69/0.0/0.0/16 015/0002287

Image Classification - classical approaches

Basic types of classifiers:

  • k-means (an unsupervised clustering method)
  • k-Nearest Neighbour
  • Bayesian classifier
  • Support Vector Machine

Types of learning:

  • Supervised learning
  • Unsupervised learning
  • Reinforcement learning
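
As a rough illustration of the first two learning types, the sketch below fits a supervised model (k-nearest neighbours, which uses the labels) and an unsupervised one (k-means, which never sees them) on the Iris data. The parameter values (`n_clusters=3`, `n_init=10`, `random_state=0`) are assumptions chosen for this example, not settings taken from the lecture.

```python
from sklearn import datasets, neighbors
from sklearn.cluster import KMeans

iris = datasets.load_iris()
X, y = iris.data, iris.target

# Supervised: the classifier is fitted on features AND labels.
knn = neighbors.KNeighborsClassifier()
knn.fit(X, y)
supervised_labels = knn.predict(X)

# Unsupervised: k-means only sees the features and returns cluster indices.
# The cluster numbering is arbitrary and need not match the class numbering.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)

print(supervised_labels[:5], cluster_ids[:5])
```

Because cluster indices are arbitrary, they cannot be compared with `iris.target` directly; a mapping between clusters and classes would have to be established first.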
In [1]:
# scikit-learn 
%pylab inline --no-import-all
from sklearn import datasets
import numpy as np
import sklearn.model_selection 
Populating the interactive namespace from numpy and matplotlib

Iris dataset

  • Loading the training data. The dataset describes the iris flower and three of its species: Iris setosa, Iris versicolor, Iris virginica. The measured features are sepal length, sepal width, petal length and petal width.
In [2]:
iris = datasets.load_iris()
# data dimensions and the last ten samples
print("data ", iris.data.shape)
print(iris.data[-10:,:])

print("")
# target classes and the last ten labels
print("target", iris.target.shape)
print(np.unique(iris.target))
print(iris.target[-10:])
data  (150, 4)
[[6.7 3.1 5.6 2.4]
 [6.9 3.1 5.1 2.3]
 [5.8 2.7 5.1 1.9]
 [6.8 3.2 5.9 2.3]
 [6.7 3.3 5.7 2.5]
 [6.7 3.  5.2 2.3]
 [6.3 2.5 5.  1.9]
 [6.5 3.  5.2 2. ]
 [6.2 3.4 5.4 2.3]
 [5.9 3.  5.1 1.8]]

target (150,)
[0 1 2]
[2 2 2 2 2 2 2 2 2 2]
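
The integer labels 0, 1, 2 can be mapped back to species names. The following short sketch uses the `feature_names` and `target_names` attributes of the object returned by `load_iris` and counts the samples per class; the variable name `class_counts` is introduced here only for illustration.

```python
import numpy as np
from sklearn import datasets

iris = datasets.load_iris()

# Column names of iris.data and the class name behind each integer label.
print(iris.feature_names)
print(iris.target_names)

# Number of samples per class.
class_counts = np.bincount(iris.target)
for name, count in zip(iris.target_names, class_counts):
    print(name, count)
```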

k-Nearest Neighbour classifier

In [4]:
from sklearn import neighbors
knn = neighbors.KNeighborsClassifier()
knn.fit(iris.data, iris.target) 
#KNeighborsClassifier(...)
predikce = knn.predict([[0.1, 0.2, 0.3, 0.4]])
print(predikce)
#array([0])
[0]
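
To make the prediction above less of a black box, here is a minimal NumPy sketch of the k-nearest-neighbour rule: compute the Euclidean distance from the query to every training sample, take the `k` closest, and let them vote. It assumes `k = 5` (the default `n_neighbors`) and uniform weights; it is an illustration of the idea, not the implementation scikit-learn uses internally.

```python
import numpy as np
from sklearn import datasets, neighbors

iris = datasets.load_iris()
query = np.array([[0.1, 0.2, 0.3, 0.4]])
k = 5  # default n_neighbors of KNeighborsClassifier

# Euclidean distance from the query to every training sample.
distances = np.linalg.norm(iris.data - query, axis=1)

# Indices of the k closest samples and a majority vote over their labels.
nearest = np.argsort(distances)[:k]
manual_label = np.bincount(iris.target[nearest]).argmax()

# Reference result from scikit-learn for comparison.
knn = neighbors.KNeighborsClassifier(n_neighbors=k).fit(iris.data, iris.target)
sklearn_label = knn.predict(query)[0]
print(manual_label, sklearn_label)
```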
In [5]:
perm = np.random.permutation(iris.target.size)
iris.data = iris.data[perm]
iris.target = iris.target[perm]

train_data = iris.data[:100]
train_target = iris.target[:100]

test_data = iris.data[100:]
test_target = iris.target[100:]

knn.fit(train_data, train_target) 

knn.score(test_data, test_target) 
Out[5]:
0.96
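
The manual permutation and slicing above can also be expressed with `sklearn.model_selection`, which the first cell imports. The sketch below is one possible variant: `random_state=0`, `stratify` and `cv=5` are assumptions made here for reproducibility, so the resulting score will generally differ from the one shown above.

```python
from sklearn import datasets, neighbors
from sklearn.model_selection import cross_val_score, train_test_split

iris = datasets.load_iris()

# 100 training and 50 test samples, with class proportions preserved.
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=50, random_state=0, stratify=iris.target
)
knn = neighbors.KNeighborsClassifier()
knn.fit(X_train, y_train)
test_accuracy = knn.score(X_test, y_test)

# 5-fold cross-validation gives a less split-dependent estimate.
cv_scores = cross_val_score(
    neighbors.KNeighborsClassifier(), iris.data, iris.target, cv=5
)
print(test_accuracy, cv_scores.mean())
```

Unlike the in-place shuffle, this version leaves `iris.data` and `iris.target` untouched.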

Bayesian classifier

In [6]:
import sklearn.naive_bayes
gnb = sklearn.naive_bayes.GaussianNB()
gnb.fit(train_data, train_target)
y_pred = gnb.predict(test_data)
print("Number of mislabeled points : %d" % (test_target != y_pred).sum())
Number of mislabeled points : 2
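
For intuition, the Gaussian naive Bayes decision rule can be sketched by hand: estimate a prior, a mean and a variance per class and feature, then pick the class with the highest log posterior under the assumption of independent Gaussian features. The code below is a simplified sketch fitted on the full dataset; the variance floor mimics scikit-learn's default `var_smoothing=1e-9`, and all variable names are chosen for this example.

```python
import numpy as np
from sklearn import datasets
from sklearn.naive_bayes import GaussianNB

iris = datasets.load_iris()
X, y = iris.data, iris.target
classes = np.unique(y)

# Per-class statistics: prior, feature means and feature variances.
priors = np.array([np.mean(y == c) for c in classes])
means = np.array([X[y == c].mean(axis=0) for c in classes])
variances = np.array([X[y == c].var(axis=0) for c in classes])
# Small variance floor, mirroring scikit-learn's default var_smoothing=1e-9.
variances += 1e-9 * X.var(axis=0).max()

# Log posterior (up to a constant) for every sample and class.
log_post = np.empty((X.shape[0], classes.size))
for i in range(classes.size):
    log_lik = -0.5 * np.sum(
        np.log(2 * np.pi * variances[i]) + (X - means[i]) ** 2 / variances[i],
        axis=1,
    )
    log_post[:, i] = np.log(priors[i]) + log_lik

manual_pred = classes[np.argmax(log_post, axis=1)]
sklearn_pred = GaussianNB().fit(X, y).predict(X)
agreement = np.mean(manual_pred == sklearn_pred)
print(agreement)
```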

SVM classifier

In [7]:
from sklearn import svm
# gamma is set explicitly; "auto" was the default in the scikit-learn version used for this run
svc = svm.SVC(gamma="auto")
svc.fit(train_data, train_target) 
y_pred = svc.predict(test_data)
print("Number of mislabeled points : %d" % (test_target != y_pred).sum())
Number of mislabeled points : 2
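
An RBF-kernel SVM is sensitive to the scale of the input features and to the `gamma` parameter. One common way to handle both is to standardise the features in a pipeline and set `gamma` explicitly. The sketch below assumes `gamma="scale"`, `C=1.0` and a fixed, stratified split; these choices are illustrative and not taken from the lecture.

```python
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=50, random_state=0, stratify=iris.target
)

# The scaler is fitted on the training data only and reused on the test data.
model = make_pipeline(StandardScaler(), svm.SVC(kernel="rbf", gamma="scale", C=1.0))
model.fit(X_train, y_train)
svm_accuracy = model.score(X_test, y_test)
print(svm_accuracy)
```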

Training data

Testing data

In [11]:
import scipy
import urllib
import skimage
import skimage.color
import skimage.measure
import skimage.io
from sklearn import svm


# URL = "http://uc452cam01-kky.fav.zcu.cz/snapshot.jpg"
URL = "https://raw.githubusercontent.com/mjirik/ZDO/master/objekty/ctverce_hvezdy_kolecka.jpg"
# as_gray replaces the deprecated as_grey spelling
img = skimage.io.imread(URL, as_gray=True)
plt.imshow(img)
# recommended classifier ...
# watch out for labeling and the "+1 problem"
Out[11]:
<matplotlib.image.AxesImage at 0x21d51b5a4e0>
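
A typical way to turn such an image into classifier input is to segment it, label the connected components and compute a feature vector per object. The sketch below does this on a small synthetic image (a square and a disk) so that it runs without network access; the choice of features (`area`, `extent`) is an assumption. It also shows one plausible reading of the "+1 problem": label values start at 1 (0 is background), while list and array indices start at 0.

```python
import numpy as np
import skimage.measure

# Synthetic binary image: one filled square and one filled disk.
binary = np.zeros((64, 64), dtype=bool)
binary[5:25, 5:25] = True
yy, xx = np.ogrid[:64, :64]
binary |= (yy - 45) ** 2 + (xx - 45) ** 2 <= 10 ** 2

# Connected-component labeling: background is 0, objects are numbered from 1.
labeled = skimage.measure.label(binary)
props = skimage.measure.regionprops(labeled)

# One feature vector per object: area and extent (area / bounding-box area).
features = np.array([[p.area, p.extent] for p in props])

for index, p in enumerate(props):
    # The list index starts at 0, the label value starts at 1.
    print(index, p.label, features[index])
```

With a real image, `features` together with hand-assigned class labels could then be passed to a classifier such as `svm.SVC` via `fit`.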

Titanic