ESF project of the University of West Bohemia in Pilsen, reg. no. CZ.02.2.69/0.0/0.0/16_015/0002287

# Image Classification - classical approaches

Basic types of classifiers:

• K-means
• k-Nearest Neighbour
• Bayesian classifier
• Support Vector Machine

Types of learning:

• Supervised learning
• Unsupervised learning
• Reinforcement learning
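
The difference between the first two types of learning can be sketched on the Iris dataset used later in this notebook. This is a minimal illustration, not part of the original notebook: k-means (unsupervised) receives only the features, while k-NN (supervised) is fitted on the features together with the known labels. The parameter values (`n_clusters=3`, `n_init=10`, `random_state=0`, `n_neighbors=5`) are assumptions chosen for the example.

```python
import numpy as np
from sklearn import datasets
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

iris = datasets.load_iris()
X, y = iris.data, iris.target

# Unsupervised: k-means sees only the features and groups them into clusters.
# The cluster indices are arbitrary and need not match the class labels.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(X)

# Supervised: k-NN is fitted on the features together with the known labels.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X, y)
train_accuracy = knn.score(X, y)

print("cluster sizes:", np.bincount(cluster_ids))
print("k-NN accuracy on the training data:", train_accuracy)
```

Because k-means never sees the labels, its output can only be compared with the classes after the clusters have been matched to them.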
In [1]:
# scikit-learn
%pylab inline --no-import-all
from sklearn import datasets
import numpy as np
import sklearn.model_selection

Populating the interactive namespace from numpy and matplotlib


### Iris dataset

• Loading the training data. The dataset describes the iris flower and three of its species: Iris setosa, Iris versicolor, and Iris virginica. The measured features are sepal length, sepal width, petal length, and petal width.
In [2]:
iris = datasets.load_iris()
# data dimensions and the last ten samples
print("data ", iris.data.shape)
print(iris.data[-10:,:])

print("")
# target classes: shape, distinct labels, and the last ten labels
print("target", iris.target.shape)
print(np.unique(iris.target))
print(iris.target[-10:])

data  (150, 4)
[[6.7 3.1 5.6 2.4]
[6.9 3.1 5.1 2.3]
[5.8 2.7 5.1 1.9]
[6.8 3.2 5.9 2.3]
[6.7 3.3 5.7 2.5]
[6.7 3.  5.2 2.3]
[6.3 2.5 5.  1.9]
[6.5 3.  5.2 2. ]
[6.2 3.4 5.4 2.3]
[5.9 3.  5.1 1.8]]

target (150,)
[0 1 2]
[2 2 2 2 2 2 2 2 2 2]
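
The numeric columns and labels printed above can be mapped back to their names. A short sketch using the `feature_names` and `target_names` attributes of the object returned by `load_iris()`; the choice of the last sample is only for illustration.

```python
from sklearn import datasets

iris = datasets.load_iris()

# Column names of iris.data and class names indexed by the target label
print(iris.feature_names)
print(iris.target_names)

# Describe the last sample in words
label = iris.target[-1]
for name, value in zip(iris.feature_names, iris.data[-1]):
    print(f"{name}: {value}")
print("class:", iris.target_names[label])
```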


### k-Nearest Neighbour classifier

In [4]:
from sklearn import neighbors
# k-NN classifier with default settings (k = 5 neighbours)
knn = neighbors.KNeighborsClassifier()
# fit on the whole dataset
knn.fit(iris.data, iris.target)
# classify a single new sample with four features
predikce = knn.predict([[0.1, 0.2, 0.3, 0.4]])
print(predikce)

[0]
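
What the classifier does internally can be sketched in a few lines of NumPy. This is a simplified illustration assuming Euclidean distance and a plain majority vote; the helper `knn_predict` is hypothetical and is not how scikit-learn implements the search.

```python
import numpy as np
from sklearn import datasets

def knn_predict(train_data, train_target, query, k=5):
    """Classify one sample by majority vote among its k nearest neighbours."""
    # Euclidean distance from the query to every training sample
    distances = np.linalg.norm(train_data - query, axis=1)
    # indices of the k closest training samples
    nearest = np.argsort(distances)[:k]
    # count the labels among the neighbours; ties go to the smallest label
    votes = np.bincount(train_target[nearest])
    return int(np.argmax(votes))

iris = datasets.load_iris()
prediction = knn_predict(iris.data, iris.target, np.array([0.1, 0.2, 0.3, 0.4]))
print(prediction)
```

The query has very small feature values, so its nearest neighbours are all Iris setosa samples (class 0), which agrees with the prediction printed above.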

In [5]:
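# shuffle samples and labels with the same random permutation
# (no seed is set, so the score below varies between runs)
perm = np.random.permutation(iris.target.size)
iris.data = iris.data[perm]
iris.target = iris.target[perm]

# first 100 shuffled samples for training, remaining 50 for testing
train_data = iris.data[:100]
train_target = iris.target[:100]

test_data = iris.data[100:]
test_target = iris.target[100:]

# refit on the training part only
knn.fit(train_data, train_target)

# mean accuracy on the held-out test part
knn.score(test_data, test_target)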

Out[5]:
0.96
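
The same 100/50 split can also be produced with `train_test_split` from `sklearn.model_selection`, which is imported in the first cell. A sketch assuming a fixed `random_state=0` for reproducibility and a stratified split so that each class is represented in both parts; neither setting comes from the original notebook.

```python
from sklearn import datasets, neighbors
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()

# 50 test samples, the remaining 100 for training; stratify keeps class proportions
train_data, test_data, train_target, test_target = train_test_split(
    iris.data, iris.target, test_size=50, random_state=0, stratify=iris.target
)

knn = neighbors.KNeighborsClassifier()
knn.fit(train_data, train_target)
accuracy = knn.score(test_data, test_target)
print(accuracy)
```

Unlike the manual permutation, this does not overwrite `iris.data` and `iris.target`.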

### Bayesian classifier

In [6]:
import sklearn.naive_bayes
gnb = sklearn.naive_bayes.GaussianNB()
gnb.fit(train_data, train_target)
y_pred = gnb.predict(test_data)
print("Number of mislabeled points : %d" % (test_target != y_pred).sum())

Number of mislabeled points : 2
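
The idea behind `GaussianNB` can be sketched by hand: each feature of each class is modelled by an independent normal distribution, and a sample is assigned to the class with the highest log posterior. The code below is a simplified illustration that omits the small variance smoothing scikit-learn applies; the helper `log_posterior` is hypothetical.

```python
import numpy as np
from sklearn import datasets
from sklearn.naive_bayes import GaussianNB

iris = datasets.load_iris()
X, y = iris.data, iris.target
classes = np.unique(y)

# Per-class mean, variance of every feature, and class prior
means = np.array([X[y == c].mean(axis=0) for c in classes])
variances = np.array([X[y == c].var(axis=0) for c in classes])
priors = np.array([np.mean(y == c) for c in classes])

def log_posterior(x):
    """Unnormalised log posterior of each class for one sample x."""
    log_likelihood = -0.5 * np.sum(
        np.log(2 * np.pi * variances) + (x - means) ** 2 / variances, axis=1
    )
    return np.log(priors) + log_likelihood

manual_pred = np.array([classes[np.argmax(log_posterior(x))] for x in X])
sklearn_pred = GaussianNB().fit(X, y).predict(X)
agreement = np.mean(manual_pred == sklearn_pred)
print("agreement with GaussianNB:", agreement)
```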


### SVM classifier

In [7]:
from sklearn import svm
# gamma is set explicitly to 'auto', the default before scikit-learn 0.22;
# newer versions default to 'scale'
svc = svm.SVC(gamma="auto")
svc.fit(train_data, train_target)
y_pred = svc.predict(test_data)
print("Number of mislabeled points : %d" % (test_target != y_pred).sum())

Number of mislabeled points : 2

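The effect of the `gamma` setting of the RBF kernel can be checked directly. In current scikit-learn versions `'auto'` uses `1 / n_features`, while `'scale'` uses `1 / (n_features * X.var())` and therefore adapts to unscaled features. A sketch assuming a fixed split (`random_state=0`), which is not part of the original notebook:

```python
from sklearn import datasets, svm
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=50, random_state=0
)

# Fit one RBF-kernel SVM per gamma setting and record the test accuracy
scores = {}
for gamma in ("auto", "scale"):
    svc = svm.SVC(kernel="rbf", gamma=gamma)
    svc.fit(X_train, y_train)
    scores[gamma] = svc.score(X_test, y_test)

print(scores)
```

On a dataset as small and well separated as Iris the two settings may give very similar results; the difference matters more when the features have very different scales.
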

### Testing data

In [11]:
import scipy
import urllib
import skimage
import skimage.color
import skimage.measure
import skimage.io
from sklearn import svm

# URL = "http://uc452cam01-kky.fav.zcu.cz/snapshot.jpg"
URL = "https://raw.githubusercontent.com/mjirik/ZDO/master/objekty/ctverce_hvezdy_kolecka.jpg"

[Figure: the test image from URL, loaded in grayscale and displayed with matplotlib]
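
The loading step itself is not shown in this cell. A minimal sketch, assuming the image is read with `skimage.io.imread` and the `as_gray=True` argument (the current spelling of the deprecated `as_grey`); the helper name `load_gray_image` is hypothetical. The synthetic image below only illustrates the colour-to-grayscale conversion without network access.

```python
import numpy as np
import skimage.color
import skimage.io

def load_gray_image(source):
    """Read an image from a file path or URL; colour images are converted to grayscale."""
    return skimage.io.imread(source, as_gray=True)

# Offline illustration of the same conversion on a synthetic RGB image
rgb = np.zeros((64, 64, 3), dtype=np.uint8)
rgb[16:48, 16:48] = 255  # white square on a black background
gray = skimage.color.rgb2gray(rgb)  # 2-D float array with values in [0, 1]
print(gray.shape, gray.min(), gray.max())
```

With network access, the image could then be loaded with `load_gray_image(URL)` and displayed with `plt.imshow(..., cmap="gray")`.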