This IPython Notebook provides a brief overview of how to install and configure the book's virtual machine for maximal enjoyment in following along with the numbered examples from Mining the Social Web (2nd Edition). You are very strongly encouraged to install the virtual machine as a development environment instead of using your existing Python installation. Installing IPython Notebook and its dependencies, along with the various third-party Python packages used throughout the book, involves some non-trivial configuration management issues, and the need to support users across multiple platforms only exacerbates the complexity. In short, the virtual machine is intended to provide all readers and consumers of this book's source code with the best possible experience. Even if you are an expert with Python developer tools, you will still likely save some time by using the book's virtual machine on your first pass through the book, so give it a try.
The remainder of this notebook provides a brief overview of how to install the virtual machine along with a few important notes to keep in mind each step of the way.
In the somewhat unlikely event that you've somehow stumbled across this notebook outside of its context on GitHub, you can find the full source code repository here.
The following screencast is less than 3 minutes and illustrates the step-by-step instructions below for installing the Mining the Social Web virtual machine.
In order to start the Vagrant-based virtual machine for Mining the Social Web, there are just a few easy steps to follow:
The first time you run "vagrant up", it takes ~20 minutes on average to download the ~323MB base image and then download and install critical updates and third-party packages. The exact time depends largely on your Internet connection speed and hardware.
[2013-07-27T01:45:27+00:00] INFO: runit_service[ipython] enabled
[2013-07-27T01:45:27+00:00] INFO: Chef Run complete in 1553.918395 seconds
[2013-07-27T01:45:27+00:00] DEBUG: Cleaning the checksum cache
[2013-07-27T01:45:27+00:00] INFO: Running report handlers
[2013-07-27T01:45:27+00:00] INFO: Report handlers complete
[2013-07-27T01:45:27+00:00] DEBUG: Exiting
Optionally, "vagrant ssh" into the virtual machine to get more comfortable working with developer tools in a terminal environment. Take it one step at a time.
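The log excerpt shown earlier in these steps can also be checked mechanically. The following is a minimal sketch, assuming a POSIX shell with sed and awk available; it embeds two of the sample log lines in a variable (in practice you would feed it the real output of "vagrant up"), looks for the "Chef Run complete" line, and converts the reported seconds to minutes. The variable names are made up for illustration.

```shell
# Sample lines copied from the log excerpt in this notebook; in practice you
# would pipe the real output of "vagrant up" through the same filter.
log='[2013-07-27T01:45:27+00:00] INFO: runit_service[ipython] enabled
[2013-07-27T01:45:27+00:00] INFO: Chef Run complete in 1553.918395 seconds'

# Pull the number of seconds out of the "Chef Run complete" line, if present.
seconds=$(printf '%s\n' "$log" | sed -n 's/.*Chef Run complete in \([0-9.]*\) seconds.*/\1/p')

if [ -n "$seconds" ]; then
    # Convert seconds to minutes, rounded to one decimal place.
    minutes=$(awk -v s="$seconds" 'BEGIN { printf "%.1f", s / 60 }')
    echo "Provisioning finished in about $minutes minutes"
else
    echo "No 'Chef Run complete' line found; provisioning may not have finished" >&2
fi
```

For the sample log this reports roughly 26 minutes, which is in line with the ~20 minute average quoted for a first run.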
You are strongly encouraged to peruse Vagrant's documentation online to get a basic understanding of how it works. Once you have a general working knowledge, the following commands are likely to be the primary ones that you'll want to know how to use. Any time you run these commands, you need to be in the top-level source code directory in which your Vagrantfile is located, since the Vagrantfile is the basis on which the commands operate.
vagrant up - Starts your virtual machine.
vagrant status - Tells you if your virtual machine is running.
vagrant suspend - Saves the state of your virtual machine. (Similar to putting it to sleep.)
vagrant resume - Restores a suspended virtual machine. (Similar to waking it up from sleep.)
Unlike a full "vagrant up", a suspend/resume operation only takes a few seconds.
vagrant halt - Shuts down your virtual machine.
After a halt, "vagrant up" only takes about one minute to complete.
vagrant destroy - Destroys your virtual machine, returning it to the state of its base image.
After a destroy, "vagrant up" takes the full ~20 minutes to complete.
vagrant ssh - Logs you into your virtual machine over SSH and provides a terminal.
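Because these commands need to be run from the directory that contains your Vagrantfile, a small wrapper can catch the most common mistake. This is only a sketch: the function names in_vagrant_dir and run_vagrant are hypothetical helpers, not part of Vagrant or the book's repository, and the check is deliberately simple (it looks only in the current directory).

```shell
# Hypothetical helper: succeed only if a Vagrantfile is in the current directory.
in_vagrant_dir() {
    [ -f Vagrantfile ]
}

# Hypothetical wrapper: refuse to call vagrant from the wrong directory.
run_vagrant() {
    if ! in_vagrant_dir; then
        echo "No Vagrantfile in $(pwd); cd to the top-level source directory first" >&2
        return 1
    fi
    vagrant "$@"
}

# A typical session, using only the subcommands listed above:
#   run_vagrant up        # start the machine
#   run_vagrant status    # check whether it is running
#   run_vagrant suspend   # save its state
#   run_vagrant resume    # restore it within a few seconds
#   run_vagrant halt      # shut it down
```

The wrapper passes all of its arguments straight through to vagrant, so it can stand in for any of the commands in the list.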
In the event that you've never used a version control system such as Git to obtain or manage source code, be assured that it's well worth the investment to learn Git fundamentals. The first two chapters of http://gitscm.com/ are particularly worth the 15 or so minutes that they take to complete, and you'll find that Stack Overflow also contains a plethora of answers to common Git questions and best practice guidelines.
The absolute minimum Git skills that you'll want to know for consuming the source code of this book include:
git clone - With Git, you clone a repository to get its source code, and you'll need to run "git clone git@github.com:ptwobrussell/Mining-the-Social-Web-2nd-Edition.git" to get the source code for this repository. (The repository URL is provided in the right margin of https://github.com/ptwobrussell/Mining-the-Social-Web-2nd-Edition if you missed it.)
git status - You can check the status of your repository by typing "git status" in the source code directory that you cloned. A common reason to use "git status" is to see which files you have modified locally and, once you have fetched from the remote repository, whether there are updates that you can pull down.
git pull - Whenever the maintainer of a repository makes an update, you can pull the update by simply typing "git pull" in the source code directory that you cloned.
git checkout - You can use "git checkout" to check out a file you may have modified, restoring it to its previous state.
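The status and checkout commands in the list above are easiest to understand by trying them somewhere harmless. The sketch below assumes git is installed and uses a throwaway local repository as a stand-in for your clone of the book's source code; the file name, commit message, and identity values are made up for illustration.

```shell
# Throwaway repository standing in for your clone of the book's source code.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
echo "original contents" > example.txt
git add example.txt
git -c user.name="Example" -c user.email="example@example.com" \
    commit -q -m "Initial commit"

# Modify the tracked file, then ask git what changed.
echo "accidental edit" > example.txt
git status --short      # lists example.txt as modified

# Restore the file to its last committed state.
git checkout -- example.txt
cat example.txt         # prints: original contents
```

The same two commands work unchanged in the directory you cloned from GitHub, where "git pull" then brings in the maintainer's updates.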
As you become more comfortable with Git, you may want to fork a Git repository, commit changes to it, and push your changes to the master branch on GitHub. Consult http://gitscm.com/ for more information on how to do these things when you are ready to make that additional leap.
You are certainly able to download a zip archive of a GitHub repository's source code (look for the "Download ZIP" button in the right margin), but doing so would be a bit ironic. This book is all about the social web, and you'd be avoiding the premier social coding platform that hosts its project code. GitHub is inherently social, and there are benefits to participating that you can't gain any other way besides plugging in, being part of the community, and applying some Git fundamentals to contribute from time to time. Forking code, opening pull requests, and otherwise contributing within the boundaries of the GitHub platform tooling is much easier than you might initially think because GitHub delivers such a tremendous user experience. Take a few extra minutes to check out the source code from GitHub instead of downloading a zip archive. You'll be glad that you took those steps.
The following screenshots may be helpful as references for Windows users who are installing Git for Windows.
Windows users should opt to install the developer tools while installing Git for Windows in order to get SSH, which allows Vagrant's "vagrant ssh" command to seamlessly work.
Logging into your virtual machine (should you need or desire to do so for advanced troubleshooting) is as easy as "vagrant ssh" so long as you have an SSH client in your path.
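Whether an SSH client is in your path can be checked before you try to log in. This is a minimal sketch, assuming a POSIX-style shell (on Windows, the shell that ships with Git for Windows should do); have_command is a hypothetical helper name, not something provided by Vagrant or Git.

```shell
# Hypothetical helper: succeed if the named program can be found on PATH.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

if have_command ssh; then
    echo "ssh found at: $(command -v ssh)"
else
    echo "ssh not found on PATH; install an SSH client before using \"vagrant ssh\"" >&2
fi
```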
Once you have run "vagrant up" and your virtual machine is up and running, you can treat the virtual machine like any other piece of running software. For example, you'll use your web browser as usual to access IPython Notebook, which is where you'll spend all of your time. The nice thing about the virtual machine experience is that it lets you use your host operating system as usual while confining all of the messy configuration management details to a well-known and highly controlled environment.
Please file tickets here on GitHub if you experience any troubles whatsoever, and thanks again for your interest in Mining the Social Web (2nd Edition). The goal of providing you with a completely turn-key virtual machine experience is to help you get the most out of the book and its source code -- not to divert your attention into unnecessary system configuration issues. Feedback on ways to improve this experience is always welcome, and pull requests are especially appreciated.