# Machine Learning Overview

### Preliminaries

• Goal
• Top-level overview of machine learning
• Materials
• Study Bishop pp. 1-4
• Study this notebook

### What is Machine Learning?

• Machine learning is about building models from data.
• Suppose we want to model a complex process about which we have little knowledge, so hand-programming a solution is not possible.
• Solution: get the computer to program itself by showing it examples of the behavior that we want.
• Practically, we choose a library of models and write a program that picks a model and tunes it to fit the data.
• Criterion: a good model generalizes well to unseen data from the same process.
• This approach is known in various scientific communities under different names, such as machine learning, statistical inference, system identification, data mining, source coding, data compression, etc.

### Machine Learning is Difficult

• Modeling (Learning) Problems
• Is there any regularity in the data anyway?
• What is our prior knowledge and how to express it mathematically?
• How to pick the model library?
• How to tune the models to the data?
• How to measure the generalization performance?
• Quality of Observed Data
• Not enough data
• Too much data?
• Available data may be messy (measurement noise, missing data points, outliers)

### A Machine Learning Taxonomy

• Supervised Learning: Given examples of inputs and corresponding desired outputs, predict outputs on future inputs.
• Examples: classification, regression, time series prediction
• Unsupervised Learning (a.k.a. density estimation): Given only inputs, automatically discover representations, features, structure, etc.
• Examples: clustering, outlier detection, compression
• Reinforcement Learning: Given sequences of inputs, actions from a fixed set, and scalar rewards/punishments, learn to select action sequences that maximize expected reward, e.g., in chess and robotics. (This is more akin to learning how to design good experiments and is not covered in this course.)
• Other stuff, like Preference Learning, learning to rank, etc. (also not covered in this course). Note that many machine learning problems can be (re-)formulated as special cases of either a supervised or unsupervised problem, which are both covered in this class.

### Supervised Learning

• Given observations $D=\{(x_1,y_1),\dots,(x_N,y_N)\}$, the goal is to estimate the conditional distribution $p(y|x)$.
##### Classification

• The target variable $y$ is a discrete-valued vector representing class labels
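To make this concrete, here is a minimal classifier sketch (the two-class Gaussian data and the nearest-class-mean decision rule are hypothetical illustrations, not a method prescribed by the notes): it tunes one mean per class to the observations $D$ and predicts the label of the nearest mean.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical labeled observations D = {(x_n, y_n)}: two well-separated classes
x0 = rng.normal(loc=-2.0, size=(50, 2))
x1 = rng.normal(loc=+2.0, size=(50, 2))
X = np.vstack([x0, x1])
y = np.array([0] * 50 + [1] * 50)

# Tune the model to the data: estimate one mean per class
means = np.stack([X[y == c].mean(axis=0) for c in (0, 1)])

def predict(x):
    # Predict the discrete label of the nearest class mean
    return int(np.argmin(np.linalg.norm(means - x, axis=1)))

accuracy = np.mean([predict(x_n) == y_n for x_n, y_n in zip(X, y)])
```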
##### Regression

• Same problem statement as classification, but now the target variable is a real-valued vector.
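A least-squares sketch of the regression case (the linear toy process below is made up for illustration; it is not from the notes):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical process with real-valued targets: y = 2x + 1 plus noise
x = rng.uniform(-1.0, 1.0, 50)
y = 2.0 * x + 1.0 + 0.05 * rng.standard_normal(50)

# Tune a linear model y = w1*x + w0 to the data by least squares
A = np.column_stack([x, np.ones_like(x)])
w, *_ = np.linalg.lstsq(A, y, rcond=None)
```

The fitted coefficients `w` should recover the slope and intercept of the underlying process up to noise.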

### Unsupervised Learning

Given data $D=\{x_1,\ldots,x_N\}$, model the (unconditional) probability distribution $p(x)$ (a.k.a. density estimation).
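As a minimal density-estimation sketch (the data source and the single-Gaussian model are hypothetical choices for illustration): fit $p(x) = \mathcal{N}(x\,|\,\mu,\sigma^2)$ to unlabeled data by maximum likelihood.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical unlabeled observations D = {x_1, ..., x_N}
X = rng.normal(loc=5.0, scale=2.0, size=1000)

# Maximum-likelihood estimates of the Gaussian parameters
mu = X.mean()
sigma = X.std()

def p(x):
    # Estimated density p(x) = N(x | mu, sigma^2)
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
```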

##### Clustering

• Group data into clusters such that all data points in a cluster have similar properties.
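A bare-bones clustering sketch (the 1-D data and the choice of k-means with $k=2$ are illustrative assumptions, not from the notes): alternate between assigning each point to its nearest center and recomputing the centers.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical unlabeled data with two underlying groups (near 0 and near 6)
X = np.concatenate([rng.normal(0.0, 0.5, 100), rng.normal(6.0, 0.5, 100)])

# k-means with k=2: assign points to nearest center, then update the centers
centers = np.array([X.min(), X.max()])
for _ in range(10):
    labels = np.abs(X[:, None] - centers[None, :]).argmin(axis=1)
    centers = np.array([X[labels == k].mean() for k in range(2)])
```

After a few iterations the centers settle near the group means, so points in each cluster share similar values.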
##### Compression / dimensionality reduction

• The output of the coder is much smaller than the original, but if the coded signal is further processed by a decoder, the result is very close (or exactly equal) to the original.
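A coder/decoder sketch using principal component analysis (the nearly 1-D toy data and the choice of PCA are illustrative assumptions, not from the notes): the coder projects each 2-D point onto one direction, and the decoder reconstructs from that single number.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical 2-D data that is nearly one-dimensional: x2 ~ 3*x1 plus noise
x1 = rng.standard_normal(200)
X = np.column_stack([x1, 3.0 * x1 + 0.1 * rng.standard_normal(200)])
X = X - X.mean(axis=0)

# Coder: project onto the top principal direction (one number per point)
_, _, Vt = np.linalg.svd(X, full_matrices=False)
code = X @ Vt[0]

# Decoder: reconstruct the 2-D points from the 1-D code
X_hat = np.outer(code, Vt[0])

reconstruction_error = np.mean((X - X_hat) ** 2)
```

The code is half the size of the original, yet the reconstruction error is small because most of the variance lies along one direction.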

### Some Machine Learning Applications

• computer speech recognition, speaker recognition
• face recognition, iris identification
• printed and handwritten text parsing
• financial prediction, outlier detection (credit-card fraud)
• user preference modeling (amazon); modeling of human perception
• modeling of the web (google)
• machine translation
• medical expert systems for disease diagnosis (e.g., mammogram)
• strategic games (chess, go, backgammon)
• any 'knowledge-poor' but 'data-rich' problem
In [2]:
open("../../styles/aipstyle.html") do f display("text/html", read(f, String)) end