Deep Learning http://www.deeplearningbook.org
Neural Networks and Deep Learning http://neuralnetworksanddeeplearning.com/index.html
Let’s start with a simple example.
Say you’re helping a friend who wants to buy a house.
She was quoted $400,000 for a 2,000 sq ft house (about 186 square meters).
Is this a good price or not?
So you ask your friends who have bought houses in that same neighborhood, and you end up with three data points:
| Area (sq ft) (x) | Price ($) (y) |
|---|---|
| 2,104 | 399,900 |
| 1,600 | 329,900 |
| 2,400 | 369,000 |
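Before fitting any model, a quick back-of-the-envelope check helps. The sketch below is my own illustration, not part of the original: it estimates a price for the 2,000 sq ft house from the average price per square foot of the three sales, which lands around $360,000 and suggests the $400,000 quote may be on the high side.

import numpy as np

areas = np.array([2104, 1600, 2400], dtype=np.float64)
prices = np.array([399900, 329900, 369000], dtype=np.float64)

# Average price per square foot across the three sales
price_per_sqft = prices.sum() / areas.sum()
print(round(price_per_sqft * 2000))  # rough estimate for a 2,000 sq ft house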
A simple predictive model (“regression model”)
Model Evaluation
Loss Function (also, cost function)
Gradient Descent
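To make the gradient descent step above concrete, here is a minimal sketch (my own illustration, not from the original) of full-batch gradient descent for a one-variable model $\hat{y} = wx + b$ with MSE loss, using the house data scaled to thousands so the updates stay well behaved:

import numpy as np

x = np.array([2.104, 1.600, 2.400])  # area in thousands of sq ft
y = np.array([399.9, 329.9, 369.0])  # price in thousands of dollars

w, b, lr = 0.0, 0.0, 0.1
for _ in range(5000):
    y_hat = w * x + b
    # Gradients of MSE = mean((y_hat - y)^2) with respect to w and b
    grad_w = 2 * np.mean((y_hat - y) * x)
    grad_b = 2 * np.mean(y_hat - y)
    w -= lr * grad_w
    b -= lr * grad_b
print(w, b)  # approaches the least-squares fit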
The softmax function, also known as softargmax or normalized exponential function, is a function that takes as input a vector of K real numbers, and normalizes it into a probability distribution consisting of K probabilities.
$$\mathrm{softmax}(x)_i = \frac{e^{x_i}}{\sum_{j} e^{x_j}}$$

import numpy as np

def softmax(s):
    return np.exp(s) / np.sum(np.exp(s), axis=0)
softmax([1.0, 2.0, 3.0, 4.0, 1.0, 2.0, 3.0])
array([0.02364054, 0.06426166, 0.1746813 , 0.474833 , 0.02364054, 0.06426166, 0.1746813 ])
That is, before applying softmax, some vector components may be negative or greater than one, and they need not sum to 1; after applying softmax, each component lies in the interval (0, 1) and the components sum to 1, so they can be interpreted as probabilities.
Furthermore, the larger input components will correspond to larger probabilities.
Softmax is often used in neural networks to map the non-normalized output of a network to a probability distribution over predicted output classes.
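One caveat worth noting (my addition, not in the original): the naive implementation above can overflow for large inputs, since np.exp grows quickly. A common, numerically stable variant subtracts the maximum before exponentiating, which leaves the result unchanged because softmax is invariant to shifting all inputs by a constant:

def softmax_stable(s):
    # Subtracting the max shifts the exponents without changing the ratios
    shifted = s - np.max(s, axis=0)
    return np.exp(shifted) / np.sum(np.exp(shifted), axis=0)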
import numpy as np
def sigmoid(x):
    return 1 / (1 + np.exp(-x))
sigmoid(0.5)
0.6224593312018546
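For reference, the function implemented above is the logistic sigmoid, $\sigma(x) = \frac{1}{1 + e^{-x}}$, which squashes any real input into the interval (0, 1); hence sigmoid(0.5) ≈ 0.622 as shown.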
# Naive scalar relu implementation.
# In the real world, most calculations are done on vectors
def relu(x):
    if x < 0:
        return 0
    else:
        return x
relu(0.5)
0.5
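As the comment above notes, real workloads apply ReLU elementwise to whole arrays. A vectorized sketch using NumPy (my addition, not part of the original):

def relu_vec(x):
    # np.maximum takes the elementwise max against 0
    return np.maximum(0, x)

relu_vec(np.array([-1.0, 0.5, 2.0]))  # array([0. , 0.5, 2. ])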
PyTorch is a Python-based scientific computing package targeted at two sets of audiences: a replacement for NumPy that can use the power of GPUs, and a deep learning research platform that provides maximum flexibility and speed.
%matplotlib inline
import matplotlib.pyplot as plt
import torch
from torch import nn, optim
from torch.autograd import Variable
import numpy as np
x_train = np.array([[2104], [1600], [2400]], dtype=np.float32)  # area in sq ft
y_train = np.array([[399.900], [329.900], [369.000]], dtype=np.float32)  # price in thousands of dollars
# x_train = np.array([[3.3], [4.4], [5.5], [6.71], [6.93], [4.168],
# [9.779], [6.182], [7.59], [2.167], [7.042],
# [10.791], [5.313], [7.997], [3.1]], dtype=np.float32)
# y_train = np.array([[1.7], [2.76], [2.09], [3.19], [1.694], [1.573],
# [3.366], [2.596], [2.53], [1.221], [2.827],
# [3.465], [1.65], [2.904], [1.3]], dtype=np.float32)
plt.plot(x_train, y_train, 'r.')
plt.show()
x_train = torch.from_numpy(x_train)
y_train = torch.from_numpy(y_train)
nn.Linear
Applies a linear transformation to the incoming data: $y = xA^T + b$
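To make the shapes concrete, a quick check (illustrative, not from the original) of the parameters of a 1-in, 1-out layer:

layer = nn.Linear(1, 1)
print(layer.weight.shape, layer.bias.shape)
# torch.Size([1, 1]) torch.Size([1])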
# Linear Regression Model
class LinearRegression(nn.Module):
    def __init__(self):
        super(LinearRegression, self).__init__()
        self.linear = nn.Linear(1, 1)  # input and output are both 1-dimensional

    def forward(self, x):
        out = self.linear(x)
        return out
model = LinearRegression()
# Define loss function and optimizer
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=1e-9)  # tiny lr needed: the raw areas (~2,000) make gradients very large
num_epochs = 1000
for epoch in range(num_epochs):
    inputs = Variable(x_train)
    target = Variable(y_train)

    # forward pass
    out = model(inputs)
    loss = criterion(out, target)

    # backward pass and parameter update
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if (epoch + 1) % 50 == 0:
        print('Epoch[{}/{}], loss: {:.6f}'
              .format(epoch + 1, num_epochs, loss.data.item()))
Epoch[50/1000], loss: 3798.923096
Epoch[100/1000], loss: 2773.092773
Epoch[150/1000], loss: 2336.129639
Epoch[200/1000], loss: 2150.003906
Epoch[250/1000], loss: 2070.720459
Epoch[300/1000], loss: 2036.951782
Epoch[350/1000], loss: 2022.566162
Epoch[400/1000], loss: 2016.437866
Epoch[450/1000], loss: 2013.828125
Epoch[500/1000], loss: 2012.717407
Epoch[550/1000], loss: 2012.243286
Epoch[600/1000], loss: 2012.041260
Epoch[650/1000], loss: 2011.954956
Epoch[700/1000], loss: 2011.918335
Epoch[750/1000], loss: 2011.904053
Epoch[800/1000], loss: 2011.897217
Epoch[850/1000], loss: 2011.894409
Epoch[900/1000], loss: 2011.893311
Epoch[950/1000], loss: 2011.892456
Epoch[1000/1000], loss: 2011.890991
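The loss plateaus near 2012 rather than approaching zero, and the learning rate had to be tiny only because the raw areas are in the thousands. A common remedy, sketched below as my own suggestion rather than part of the original, is to standardize the feature first so that an ordinary learning rate works:

x_mean, x_std = x_train.mean(), x_train.std()
x_norm = (x_train - x_mean) / x_std  # zero mean, unit variance
optimizer = optim.SGD(model.parameters(), lr=1e-2)  # a normal-sized lr now converges
# ...then re-run the training loop above with x_norm in place of x_train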
Here `criterion` is the mean squared error loss: $\mathrm{MSE}(x, y) = \frac{1}{N}\sum_{i=1}^{N}(x_i - y_i)^2$, where $N$ is the batch size.
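As a sanity check, the reported loss can be reproduced by computing the mean squared error by hand (an illustrative snippet; torch.no_grad() requires PyTorch 0.4+):

with torch.no_grad():
    pred = model(x_train)
    manual_mse = ((pred - y_train) ** 2).mean()
print(manual_mse.item())  # should match the final reported loss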
model.eval()
LinearRegression( (linear): Linear(in_features=1, out_features=1, bias=True) )
predict = model(Variable(x_train))
predict = predict.data.numpy()
plt.plot(x_train.numpy(), y_train.numpy(), 'ro', label='Original data')
plt.plot(x_train.numpy(), predict, 'b-s', label='Fitted line')
plt.xlabel('X', fontsize= 20)
plt.ylabel('y', fontsize= 20)
plt.legend()
plt.show()
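Finally, circling back to the opening question, the trained model can be queried for the friend's 2,000 sq ft house (a sketch; given the plateaued loss above, expect the estimate to be rough):

with torch.no_grad():
    query = torch.tensor([[2000.0]])
    print(model(query).item())  # model's estimate, in thousands of dollars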
Build a convolutional neural network with PyTorch and work with the MNIST data: https://computational-communication.com/pytorch-mnist/