In this lesson you will learn about the following:
AWS is a cloud computing platform offering a range of managed and unmanaged services. We will be working with EC2, which is commonly used for running machine learning workloads in production environments.
AWS EC2 offers a Free Tier, which is what we will be working with. If you don't have an AWS account yet, please create one.
Ubuntu (VM) Install:
Start Server:
ipython notebook
We will be coding most of our work in Python, so it is important to be familiar with the language. One good resource I found is "Learn Python the Hard Way", which is available for free online.
http://learnpythonthehardway.org/book/
I'll go over some basics that will get you started.
x = 10
y = 5
z = x + y
print z
word_1 = "Hello"
word_2 = "Python"
sentence = word_1 + " " + word_2
print sentence
# Let's do some math
print x + y
print x - y
print x * y
print x / y
# Formatting strings
template = "Hello %s"
print template % "Python"
# Using string formatting to improve math results
print "Add Result: %s" % (x + y)
print "Subtract Result: %s" % (x - y)
print "Multiply Result: %s" % (x * y)
print "Divide Result: %s" % (x / y)
# Passing Multiple Parameters to a string
template = "%s %s %s = %s"
print template % (x, "+", y, x+y)
print template % (x, "-", y, x-y)
print template % (x, "*", y, x*y)
print template % (x, "/", y, x/y)
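One caveat with the division example above: under Python 2, `/` between two integers truncates the result, so `7 / 2` gives `3`. A minimal sketch of one way around this, assuming the illustrative values `a` and `b` below (they are not from the lesson), converts one operand to a float and uses `%.2f` to control the number of decimals. The single-argument `print(...)` form is used so the snippet runs under both Python 2 and Python 3.

```python
# Illustrative values (assumptions, not from the lesson)
a = 7
b = 2

# Converting one operand to float forces true division in Python 2 as well
quotient = float(a) / b

# %.2f formats the number with two decimal places
message = "%s / %s = %.2f" % (a, b, quotient)
print(message)
```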
# List, Tuple and Dictionary
numbers_list = [1,2,3,4,5]
numbers_tuple = (1,2,3,4,5)
numbers_dict = {1:1, 2:2, 3:3, 4:4, 5:5}
# List
print "Full List: %s" % numbers_list
print "List Length: %s" % len(numbers_list)
print "First Item: %s" % numbers_list[0]
print "Last Item: %s" % numbers_list[-1]
print ""
# Tuple (str() is needed because % would otherwise unpack the tuple)
print "Full Tuple: %s" % str(numbers_tuple)
print "Tuple Length: %s" % len(numbers_tuple)
print "First Item: %s" % numbers_tuple[0]
print "Last Item: %s" % numbers_tuple[-1]
print ""
# Dictionary (values are sorted because dictionary order is not guaranteed)
print "Full Dictionary: %s" % str(numbers_dict)
print "Dictionary Length: %s" % len(numbers_dict)
print "First Item: %s" % sorted(numbers_dict.values())[0]
print "Last Item: %s" % sorted(numbers_dict.values())[-1]
print ""
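The dictionary example above picks out the first and last values, but a dictionary is normally accessed by key, and older Python versions do not guarantee any particular ordering of `values()`. As a small sketch (the `prices` dictionary and its contents are made up for illustration), this shows key lookup, a safe lookup with a default, and iteration in sorted key order:

```python
# Hypothetical data for illustration only
prices = {"apple": 3, "banana": 1, "cherry": 7}

# Look up a value by its key
apple_price = prices["apple"]

# get() returns a default instead of raising KeyError for a missing key
durian_price = prices.get("durian", 0)

# Iterate in sorted key order so the output is deterministic
lines = []
for name in sorted(prices):
    lines.append("%s costs %s" % (name, prices[name]))

print("\n".join(lines))
```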
IPython Notebook will be our Integrated Development Environment (IDE). It is web-based and supports interactive development. This comes in handy when we work with data, since we develop our logic based on the data and the results we see.
You can write code and execute it on the spot, and have others access your work to truly work as a team.
It also supports inline charts and LaTeX to visualize your work and results.
Let's test LaTeX first to see how it can make our functions more readable.
LaTeX code for a function:
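The LaTeX source of the original cell is not shown here. As a stand-in, and purely as an assumed example, a straight-line function can be typed into a Markdown cell between `$$` delimiters, and the notebook should render it as a formatted equation:

```latex
$$
f(x) = \theta_0 + \theta_1 x
$$
```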
A good resource for finding chart examples is:
import matplotlib.pyplot as plt
X = [1,2,3,4,5]
Y = [3,2,5,0,1]
plt.plot(X, Y)
plt.scatter(X, Y)
plt.scatter(X, Y, c=Y)
plt.scatter( X, Y, c=Y, marker="*", s=400)
# Complete list of markers
# http://matplotlib.org/api/markers_api.html
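The calls above can also be combined into one labelled chart. The following is a sketch rather than part of the original lesson: the title, legend labels and the output file name `sample_chart.png` are arbitrary choices, and the `Agg` backend line is only there so the snippet also runs outside the notebook.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside the notebook
import matplotlib.pyplot as plt

X = [1, 2, 3, 4, 5]
Y = [3, 2, 5, 0, 1]

# One figure with a single set of axes
fig, ax = plt.subplots()

# Line and star-shaped scatter points drawn on the same axes
ax.plot(X, Y, label="line")
ax.scatter(X, Y, c=Y, marker="*", s=400, label="points")

# Title, axis labels and legend make the chart self-explanatory
ax.set_title("Sample data")
ax.set_xlabel("X")
ax.set_ylabel("Y")
ax.legend()

# Save to disk (arbitrary file name chosen for this example)
fig.savefig("sample_chart.png")
```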
from mpl_toolkits.mplot3d import axes3d
import matplotlib.pyplot as plt
from matplotlib import cm
fig = plt.figure(figsize=(10,10))
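# fig.gca(projection='3d') was removed in newer matplotlib releases; add_subplot works in old and new versions
ax = fig.add_subplot(111, projection='3d')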
X, Y, Z = axes3d.get_test_data(0.05)
ax.plot_surface(X, Y, Z, rstride=4, cstride=4, alpha=0.3)
cset = ax.contourf(X, Y, Z, zdir='z', offset=-100, cmap=cm.coolwarm)
cset = ax.contourf(X, Y, Z, zdir='x', offset=-40, cmap=cm.coolwarm)
cset = ax.contourf(X, Y, Z, zdir='y', offset=40, cmap=cm.coolwarm)
ax.set_xlabel('X')
ax.set_xlim(-40, 40)
ax.set_ylabel('Y')
ax.set_ylim(-40, 40)
ax.set_zlabel('Z')
ax.set_zlim(-100, 100)
plt.show()
NumPy: A powerful library for working with large arrays of numbers. It performs operations much faster than pure Python.
Pandas: Introduces the Series and DataFrame objects, which are built on NumPy arrays. These objects provide many out-of-the-box tools to analyze and perform complex operations on your data.
import numpy as np
import pandas as pd
array = np.arange(100)
array = array.reshape((10,10))
df = pd.DataFrame(array)
print df
print df[1] - df[0]
print df[1] + df[0]
print df[1] * df[0] - df[3]
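Column arithmetic in a DataFrame is element-wise: each row of the result is computed from the same row of the input columns. The sketch below rebuilds the same 10x10 frame and stores a result as a new column; the column name `"diff"` is an arbitrary choice for this illustration.

```python
import numpy as np
import pandas as pd

# Same 10x10 frame as above: row i holds the values 10*i .. 10*i + 9
df = pd.DataFrame(np.arange(100).reshape((10, 10)))

# Element-wise subtraction: neighbouring columns always differ by 1
diff = df[1] - df[0]

# Store the result as a new column ("diff" is an arbitrary name)
df["diff"] = diff

# head() shows only the first rows, which is handy for large frames
print(df.head(3))
```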
We can read a CSV file located on our system or on the internet.
You can open Ajenti by opening this link:
from IPython.display import HTML
input_form = """
<a id="admin_link" target="_blank" href="#">Ajenti Administration Interface</a>
<p>User: root<br> Password: admin</p>
"""
javascript = """
<script type="text/Javascript">
document.getElementById('admin_link').href = "https://" + window.location.hostname + ":8000"
</script>
"""
HTML(input_form + javascript)
csv_data = pd.read_csv("data/yourfile.csv")
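`pd.read_csv` accepts a file path, a URL, or any file-like object. Since `data/yourfile.csv` is only a placeholder, the following sketch reads from an in-memory buffer instead so that it can run anywhere; the column names and values are invented for illustration.

```python
import io
import pandas as pd

# Invented CSV content standing in for a real file
csv_text = u"name,score\nalice,90\nbob,75\ncarol,82\n"

# read_csv also accepts file-like objects, so wrap the text in a buffer
csv_data = pd.read_csv(io.StringIO(csv_text))

# Basic inspection: dimensions and a simple column statistic
n_rows, n_cols = csv_data.shape
mean_score = csv_data["score"].mean()

print("Rows: %s, Columns: %s" % (n_rows, n_cols))
print("Mean score: %.1f" % mean_score)
```

To load a real file or a URL, pass the path or address as the first argument in place of the buffer.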
Please leave any questions on:
Next Lesson - Introduction to Machine Learning
In the next lesson: