#!/usr/bin/env python
# coding: utf-8
#
# *This notebook contains an excerpt from the [Python Data Science Handbook](http://shop.oreilly.com/product/0636920034919.do) by Jake VanderPlas; the content is available [on GitHub](https://github.com/jakevdp/PythonDataScienceHandbook).*
#
# *The text is released under the [CC-BY-NC-ND license](https://creativecommons.org/licenses/by-nc-nd/3.0/us/legalcode), and code is released under the [MIT license](https://opensource.org/licenses/MIT). If you find this content useful, please consider supporting the work by [buying the book](http://shop.oreilly.com/product/0636920034919.do)!*
#
# < [Further Resources](04.15-Further-Resources.ipynb) | [Contents](Index.ipynb) | [What Is Machine Learning?](05.01-What-Is-Machine-Learning.ipynb) >
#
# # Machine Learning
#
# In many ways, machine learning is the primary means by which data science manifests itself to the broader world.
# Machine learning is where the computational and algorithmic skills of data science meet the statistical thinking of data science, and the result is a collection of approaches to inference and data exploration that are not about effective theory so much as effective computation.
#
# The term "machine learning" is sometimes thrown around as if it is some kind of magic pill: *apply machine learning to your data, and all your problems will be solved!*
# As you might expect, the reality is rarely this simple.
# While these methods can be incredibly powerful, to be effective they must be approached with a firm grasp of the strengths and weaknesses of each method, as well as a grasp of general concepts such as bias and variance, overfitting and underfitting, and more.
#
# This chapter will dive into practical aspects of machine learning, primarily using Python's [Scikit-Learn](http://scikit-learn.org) package.
# This is not meant to be a comprehensive introduction to the field of machine learning; that is a large subject and necessitates a more technical approach than we take here.
# Nor is it meant to be a comprehensive manual for the use of the Scikit-Learn package (for this, you can refer to the resources listed in [Further Machine Learning Resources](05.15-Learning-More.ipynb)).
# Rather, the goals of this chapter are:
#
# - To introduce the fundamental vocabulary and concepts of machine learning.
# - To introduce the Scikit-Learn API and show some examples of its use.
# - To take a deeper dive into the details of several of the most important machine learning approaches, and develop an intuition for how they work and when and where they are applicable.
#
# Much of this material is drawn from the Scikit-Learn tutorials and workshops I have given on several occasions at PyCon, SciPy, PyData, and other conferences.
# Any clarity in the following pages is likely due to the many workshop participants and co-instructors who have given me valuable feedback on this material over the years!
#
# Finally, if you are seeking a more comprehensive or technical treatment of any of these subjects, I've listed several resources and references in [Further Machine Learning Resources](05.15-Learning-More.ipynb).