- Tuesday and Thursday: 12:30 - 1:45, MS Teams online. See "Overview" web page.
- All students can watch recorded lecture videos through Echo360 in Canvas

- Presentation and discussions of lecture notes and assignments and all other topics of interest to you!

- 7 or 8 assignments: combinations of Python implementations and applications of machine learning algorithms, with text, math, and visualizations that explain the algorithms and results.
- Check in solutions through Canvas as Jupyter notebooks.
- No exams or quizzes!

- What is Python?
- What is Jupyter?
- And what do you mean by "machine learning"?
- Arrrggghh!!

Supervised Learning

- Given samples of inputs and correct outputs, find a computational model that takes an input and produces an approximately correct output, even for inputs that were not among the samples.
- For example, predict stock prices, or classify a patient's symptoms into likely diagnoses.
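
As a minimal sketch of this idea (not one of the algorithms we will implement in class), the following fits a straight line to noisy samples using `numpy`'s least-squares solver and then predicts outputs for inputs it has never seen. The data, the underlying line $t = 2x + 1$, and the names `X_train`, `T_train`, and `w` are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical training samples: inputs and noisy "correct" outputs from t = 2x + 1
X_train = rng.uniform(0, 10, size=(50, 1))
T_train = 2 * X_train + 1 + rng.normal(0, 0.1, size=(50, 1))

# Add a column of ones so the model can learn an intercept
X1 = np.hstack((np.ones((50, 1)), X_train))

# Find weights w that minimize the squared error between X1 @ w and T_train
w, *_ = np.linalg.lstsq(X1, T_train, rcond=None)

# Apply the model to new inputs that were not in the training samples
X_new = np.array([[1.0, 12.0], [1.0, -3.0]])  # first column is the constant 1
predictions = X_new @ w

print(w.ravel())            # intercept and slope, close to 1 and 2
print(predictions.ravel())  # approximately correct outputs for the new inputs
```

The model only ever sees the 50 samples, yet its outputs for $x = 12$ and $x = -3$ should be close to the values the underlying line would give.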

Unsupervised Learning

- Given samples of input data, find similarities, dissimilarities, anomalies, trends, and low-dimensional representations of the data.
- For example, identify network attacks, or a faulty sensor.
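
Here is a rough sketch of the faulty-sensor example, assuming a simple rule: flag any reading that is far from the median of all readings. No correct outputs are given; the structure is found in the input data alone. The simulated readings, the two injected faults, and the threshold of 10 median absolute deviations are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical readings from a sensor that normally reports values near 20
readings = rng.normal(20.0, 0.5, size=200)
readings[57] = 35.0   # simulate a faulty reading
readings[140] = 3.0   # and another one

# The median and the median absolute deviation (MAD) are barely affected by outliers
median = np.median(readings)
mad = np.median(np.abs(readings - median))

# Flag readings that are more than 10 MADs from the median (threshold is assumed)
anomalies = np.where(np.abs(readings - median) > 10 * mad)[0]
print(anomalies)
```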

Reinforcement Learning

- Given a system or device with observable and action variables, find a computational model that accepts sequences of observable values and produces sequences of action values that optimize the system's or device's behavior over time.
- For example, learn to take shortest paths, or learn to play chess.

Algorithms we will focus on

- Regression
- Classification
- Clustering
- Q-Learning, Policy Gradients

All using deep learning (artificial neural networks), implemented ourselves mostly using `numpy` and sometimes using public frameworks, such as `pytorch` and `tensorflow`.

If you have little or no experience with Python, it may be difficult for you to gain the necessary experience fast enough.

- True or False: Python code must be compiled before being run.
- True or False: Variables in Python can be assigned values of any type.
- True or False: Python is to be avoided for big machine learning tasks because execution is so slow.
- True or False: Python is free and available on all platforms.
- True or False: Knowing Python will not help you get a job.

Easy to install on your own laptop. The Anaconda distribution is recommended.

Popularity of Python judged by accesses of tutorials on GitHub.

In [ ]:

```
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
plt.plot(np.sin(0.1 * np.arange(100)));
```

Finally, we have

- literate programming
- executable documents
- reproducible results
- super tutorials
- excellent lab notebook for computer scientists
- executable lecture notes!

GitHub, and other services, can render them in browsers.

How would you do these?

- Assign to variable `W` a 5 x 2 matrix (`numpy` array) of random values between -0.1 and 0.1.
- Assign to variable `X` a 10 x 5 matrix of random values between 0 and 10.
- Multiply two matrices, `X` and `W`.
- Take the transpose of `XW`.
- Multiply all elements of matrix `X` by 2 and subtract 1 from all elements.

In [ ]:

```
# 1. Assign to variable W a 5 x 2 matrix (numpy array) of random values between -0.1 and 0.1
```

In [ ]:

```
# 2. Assign to variable X a 10 x 5 matrix of random values between 0 and 10.
```

In [ ]:

```
# 3. Multiply two matrices, X and W.
```

In [ ]:

```
# 4. Take the transpose of XW.
```

In [ ]:

```
# 5. Multiply all elements of matrix X by 2 and subtract 1 from all elements.
```
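
One possible set of answers to the five exercises, sketched with `np.random.uniform` and the `@` matrix-multiplication operator. Other approaches work just as well (for example, scaling the output of `np.random.random`), and the names `XW`, `XW_T`, and `X_scaled` are chosen here only for illustration.

```python
import numpy as np

# 1. 5 x 2 matrix of random values between -0.1 and 0.1
W = np.random.uniform(-0.1, 0.1, size=(5, 2))

# 2. 10 x 5 matrix of random values between 0 and 10
X = np.random.uniform(0, 10, size=(10, 5))

# 3. Matrix product of X and W, giving a 10 x 2 matrix
XW = X @ W

# 4. Transpose of XW, giving a 2 x 10 matrix
XW_T = XW.T

# 5. Multiply all elements of X by 2 and subtract 1, elementwise
X_scaled = 2 * X - 1

print(W.shape, X.shape, XW.shape, XW_T.shape, X_scaled.shape)
```

Note that `X * W` would not work here: `*` is elementwise multiplication, and the shapes 10 x 5 and 5 x 2 are not compatible for that.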

In [ ]:

```
# Iteratively make small adjustments to an initial matrix `W` to make it converge on values in matrix `Z`.
```
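
A minimal sketch of one way to do this, assuming the simplest possible rule: on every step, move `W` a small fraction of the way toward `Z`. This is the same as gradient descent on the squared error $\frac{1}{2}\sum (Z - W)^2$. The target matrix, the step size of 0.1, and the 100 steps are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

Z = rng.uniform(0, 10, size=(5, 2))       # hypothetical target values
W = rng.uniform(-0.1, 0.1, size=(5, 2))   # initial matrix

step_size = 0.1  # fraction of the remaining difference removed on each step (assumed)
for step in range(100):
    # Small adjustment: move W a little in the direction that reduces the error
    W = W + step_size * (Z - W)

# Largest remaining difference between any element of W and Z
print(np.max(np.abs(Z - W)))
```

Each step shrinks the remaining difference by a factor of 0.9, so `W` gets closer and closer to `Z` without ever jumping there in one move.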

Here is a wild statement. In a high-dimensional space, most of the points in a hypercube lie in a thin shell at its surface!

Huh? As discussed at this stackexchange post, consider the volume of the hypercube in $d$-dimensional space with each side of the hypercube being of length 1.

$$ 1 \cdot 1 \cdot 1 \cdot \ldots \cdot 1 = 1^d = 1$$

Nothing surprising here.

Now, let's look at the volume of the "interior" of this hypercube. Let's define the interior as the space from 0.01 to 0.99 along each dimension. This would be

$$ 0.98 \cdot 0.98 \cdot 0.98 \cdot \ldots \cdot 0.98 = 0.98^d$$

So what? Well, let's look at the results of these calculations. First let's consider two dimensions.

In [ ]:

```
0.98**2
```

In [ ]:

```
d = 2
print(f'Total volume {1**d}. Interior volume {0.98**d}')
```

How about 10 dimensions, or 50?

In [ ]:

```
d = 10
print(f'Total volume {1**d}. Interior volume {0.98**d}')
```

In [ ]:

```
d = 50
print(f'Total volume {1**d}. Interior volume {0.98**d}')
```

Okay, this is getting tedious. Let's automate this and calculate interior volume for a range of dimensions up to 100, and plot it.

In [ ]:

```
def interior_volume(d):
    # Each side of the interior has length 1 - 2 * 0.01 = 0.98
    return (1 - 0.01 * 2) ** d

dims = np.arange(1, 101)  # 1 through 100 dimensions
plt.plot(dims, interior_volume(dims))
plt.xlabel('dimensions')
plt.ylabel('interior volume')
plt.grid(True)
```

Keep going, up to 1000 dimensions.

In [ ]:

```
def interior_volume(d):
    return 0.98 ** d

dims = np.arange(1, 1001)  # 1 through 1000 dimensions
plt.plot(dims, interior_volume(dims))
plt.xlabel('dimensions')
plt.ylabel('interior volume')
plt.grid(True)
plt.text(500, 0.6, 'WHOA!', fontsize=40);
```

In [ ]:

```
interior_volume(1000)
```

Go back to the definition of `interior_volume` and try a thinner shell.
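
One way to make that experiment easier is to turn the shell thickness into a parameter. This is only a sketch: the `shell` argument and its default of 0.01 are additions for illustration, and the thicknesses tried below are arbitrary.

```python
import numpy as np

def interior_volume(d, shell=0.01):
    """Volume of the unit hypercube's interior in d dimensions,
    after removing a shell of the given thickness from every face."""
    return (1 - 2 * shell) ** d

dims = np.array([2, 10, 100, 1000])
for shell in [0.01, 0.001, 0.0001]:
    # One row per shell thickness, one column per number of dimensions
    print(shell, interior_volume(dims, shell))
```

A thinner shell slows the collapse of the interior volume, but for any fixed thickness the interior still shrinks toward zero as the number of dimensions grows.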