Table of ContentsLearning MLChapter 1.01 - Help


Chapter 1 - Introduction

This book is for:

  • those who are learning (or want to learn, hello! 👋) machine learning, but don't necessarily have a mathematical or statistical background
  • those who have seen or used so-called 'black box' machine learning services online, but want to understand what is actually going on inside the box
  • those who conceptually understand the inputs and outputs, but want to know how one leads to the other
  • (and finally,) those who have been building machine learning models with a graphical tool, but want to extend their skills to implement models in code.

If that sounds like you, congratulations! You've come to the right place. If not, hopefully you'll still find this a valuable resource.

What is Machine Learning?

Well done for making it this far. If you've judged the book by its cover then you're indeed learning machine learning - so it's time to set some definitions. By using this resource, hopefully you'll be in a better position to explain machine learning and teach it to others (technical and non-technical people alike). To help with this, here's a handy definition:

Machine Learning is a set of computational methods which allow a computer (machine) to automatically find (learn) relationships within data.

Why is this useful? As the relationships are learned, they are collected into a model. A model is a simplified (but as we'll discover, not always simple) representation of the real world. Models (including machine learning models) are useful because, by generalising our understanding of the real world, they allow us to predict future events and discover structure that was previously unknown.
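To make the idea of "learning a relationship" a little more concrete, here is a minimal sketch in plain Python. It assumes a tiny, made-up dataset (hours studied and exam scores) and uses ordinary least squares to fit a straight line; the data, the function names fit_line and predict, and the choice of method are illustrative assumptions, not something prescribed by this book.

```python
# Hypothetical data: hours studied (input) and exam score (value to predict).
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]


def fit_line(xs, ys):
    """Learn a slope and intercept from the data by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Use the learned relationship to estimate a value for a new input."""
    slope, intercept = model
    return slope * x + intercept


# The "model" here is just two numbers that summarise the relationship.
model = fit_line(hours, scores)
print("learned (slope, intercept):", model)
print("predicted score for 6 hours:", predict(model, 6.0))
```

The two learned numbers are the model: a simplified representation of how study time relates to score in this made-up data, which can then be used to predict a case that was never observed.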

What is machine learning used for?

Machine learning has countless uses, and the list is ever growing. Some popular examples include:

  • Retaining customers proactively by predicting customer churn
  • Making better credit decisions by predicting risk of default
  • Supporting students to complete their studies by predicting risk of drop-out
  • Optimising maintenance schedules by predicting equipment failures
  • Improving customer service availability by predicting inbound calls

What are the different types of machine learning?

As noted, machine learning is a set of computational methods, and this set can be primarily broken into two types: supervised machine learning and unsupervised machine learning.

Supervised machine learning is useful for prediction, and requires observations to have labels. An observation is a single record (or row) in the data, which represents some real-world object or event (such as a student, a loan, or a movie). A label is the data element you are trying to predict (such as dropped out, application status or box-office earnings).

Supervised methods are then further broken down into classification and regression. Classification uses labels with a discrete value (such as dropped out, which can either be True or False depending on whether that student completed their course or not, or application status, which could be one of accepted, declined or withdrawn). Regression uses labels with a continuous value (such as box-office earnings, a dollar amount - but equally could include any other number such as age or days between failures).
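The difference between the two can be sketched with the same observations and two different kinds of label. The example below is only an illustration under stated assumptions: the student data is invented, and the k-nearest-neighbours approach (predict from the k most similar observations) was picked here purely because it is short enough to write in plain Python. The helper names nearest_labels, classify and regress are hypothetical.

```python
from collections import Counter

# Hypothetical observations: (attendance %, average grade %) for six students.
students = [(95, 82), (90, 75), (85, 78), (40, 51), (35, 48), (50, 55)]

# Discrete label for classification: did the student drop out?
dropped_out = [False, False, False, True, True, True]

# Continuous label for regression: the student's final mark.
final_mark = [84.0, 76.0, 80.0, 50.0, 46.0, 57.0]


def nearest_labels(observations, labels, query, k=3):
    """Return the labels of the k observations closest to the query."""
    def squared_distance(obs):
        return sum((a - b) ** 2 for a, b in zip(obs, query))

    ranked = sorted(zip(observations, labels), key=lambda pair: squared_distance(pair[0]))
    return [label for _, label in ranked[:k]]


def classify(observations, labels, query, k=3):
    """Classification: predict the most common label among the neighbours."""
    votes = Counter(nearest_labels(observations, labels, query, k))
    return votes.most_common(1)[0][0]


def regress(observations, labels, query, k=3):
    """Regression: predict the average label among the neighbours."""
    neighbours = nearest_labels(observations, labels, query, k)
    return sum(neighbours) / len(neighbours)


new_student = (88, 77)
print("dropped out?", classify(students, dropped_out, new_student))
print("estimated final mark:", regress(students, final_mark, new_student))
```

Notice that the observations are identical in both cases. Only the label changes, and with it the kind of answer the model gives: a category for classification, a number for regression.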

Unsupervised machine learning is useful for discovering structure, and does not require observations to have labels.
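As a rough sketch of what "discovering structure" without labels can look like, the example below groups some made-up customer spending figures into two clusters. It assumes a simplified, one-dimensional version of k-means clustering with fixed starting centroids; the data, the function name k_means_1d and the starting values are all hypothetical choices made for illustration.

```python
# Hypothetical unlabelled observations: monthly spend per customer.
spend = [12.0, 15.0, 11.0, 14.0, 95.0, 102.0, 98.0, 105.0]


def k_means_1d(values, centroids, iterations=10):
    """Group values around centroids by repeating assign and update steps."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assign step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for value in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Starting centroids are an arbitrary assumption; no labels are supplied.
centroids, clusters = k_means_1d(spend, centroids=[0.0, 50.0])
print("cluster centres:", centroids)
print("clusters:", clusters)
```

No observation was ever marked as a "low" or "high" spender. The two groups emerge from the data alone, which is the sense in which unsupervised methods discover structure.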

For this book, we are mainly concerned with supervised machine learning and how it is used for prediction, although as we will see, unsupervised methods can be very useful along the way.

How is machine learning performed?

There are many different software tools available that implement the methods described above. Some of these are graphical and some code-based. Some are free and open source, and others are absurdly expensive and enterprise licensed.

Two of the most popular tools for machine learning are the programming languages R and Python. Both languages are free and open source, and have strong support for machine learning with extensive package libraries and equally large communities. Other programming languages do have support for machine learning, and so the focus of this book is not to explicitly teach one particular language's machine learning implementations - but to teach the general usage of the methods in a language-agnostic way.

For this reason, code examples in this book are given in R or Python (or both), and hopefully over time this will extend to other languages as necessary.

