What is Machine Learning?

Machine learning is a relatively new area of programming concerned with building models from known data and then using those models to predict outcomes for new data. It is closely linked with statistics, econometrics and optimisation, but we will be calling it machine learning for now. There are two main types of machine learning:

  • Supervised Learning - if we have pre-categorised (labelled) data and want to build a model that predicts the category from known features
  • Unsupervised Learning - if we have uncategorised data and we want to group together similar data points

We will learn algorithms from both categories, and the weaknesses and strengths of each one.
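
To make the two categories concrete, here is a minimal sketch using scikit-learn (assuming it is installed). The height and weight numbers and the "small"/"large" labels are made up for illustration, and k-nearest neighbours and k-means are simply one possible algorithm from each category, not the only choices.

```python
from sklearn.cluster import KMeans
from sklearn.neighbors import KNeighborsClassifier

# Toy data, invented for this sketch: [height_cm, weight_kg]
features = [[150, 50], [152, 53], [155, 52], [180, 85], [183, 88], [185, 90]]

# Supervised learning: every row comes with a known category (label)
labels = ["small", "small", "small", "large", "large", "large"]
classifier = KNeighborsClassifier(n_neighbors=3)
classifier.fit(features, labels)

# Predict the category of two new, unseen data points
predicted = classifier.predict([[154, 51], [182, 86]])
print(predicted)  # 'small' for the first point, 'large' for the second

# Unsupervised learning: no labels are given, we only ask for 2 groups
clusterer = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = clusterer.fit_predict(features)
print(cluster_ids)  # similar rows share a cluster id
```

Note that the cluster ids are arbitrary numbers: k-means can tell us which rows belong together, but not what each group should be called.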

What Libraries Will We Use?

The main library we will be using is scikit-learn. In other sections we will also look at libraries such as TensorFlow (for neural networks) and NLTK (for natural language processing), which draw on many of the same algorithms and ideas as machine learning.

What's The Difference Between Machine Learning And Statistics?

Machine learning is more about building models for prediction, whereas statistics, and the statistics libraries in Python, are more to do with analysis. In machine learning we want to get predictions from a model quickly and be able to deploy it in a quick and easy way. If we're coming from a statistical viewpoint, we want to know how accurate the model is and to run other tests that check how robust it is, with the focus more on testing for validity and less on implementation.
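
The two viewpoints can sit side by side in a few lines. The sketch below is only an illustration, assuming scikit-learn is installed: the iris dataset bundled with scikit-learn, logistic regression and the 25% hold-out split are all our own choices for the example. The first half is the "get predictions quickly" step, and the last line is a very simple accuracy check on data the model has not seen.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Small example dataset bundled with scikit-learn (chosen for illustration)
X, y = load_iris(return_X_y=True)

# Hold back a quarter of the rows so the model can be checked on unseen data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Machine learning focus: fit a model and get predictions quickly
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
predictions = model.predict(X_test)

# Validity focus: how often do the predictions match the known answers?
accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy on held-out data: {accuracy:.2f}")
```

A single accuracy figure is only a starting point; a statistical treatment would go further and test how much that figure can be trusted.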

What Kind Of Problems Is Machine Learning Used For?

Machine learning is widely used in finance for fraud detection and financial modelling, in IT services (such as Google's autocomplete results), and in image and pattern recognition. Machine learning is a massive part of data analysis and data science, and although you might not immediately know what you want to use these solutions for, there's a strong chance you'll run into a situation where one of these methods will save you a lot of work.