This notebook explains the many benefits Jupyter, the IRKernel and conda can provide for data scientists working in R.
Jupyter, previously called IPython, is already widely adopted by data scientists, researchers, and analysts. Jupyter's notebook user interface enables mixing executable code with narrative text, equations, interactive visualizations, and images to enhance team collaboration and advance the state of reproducible research and training. Jupyter began with Python and now has kernels for 50 different languages, and the IRKernel is the native R kernel for Jupyter.
Data scientists, researchers, and analysts use the conda package manager to install and organize project dependencies. With conda they can easily build and share metapackages, which are downloadable bundles of packages. Conda works with Linux, OS X, and Windows, and is language agnostic, so we can use it with any programming language and with projects that depend on multiple languages.
Let's use conda and Jupyter to start a data science project in R.
The Anaconda team has created an "R Essentials" bundle with the IRKernel and over 80 of the most used R packages for data science, including dplyr
, shiny
, ggplot2
, tidyr
, caret
and nnet
.
Downloading "R Essentials" requires conda. Miniconda includes conda, Python, and a few other necessary packages, while Anaconda includes all this and over 200 of the most popular Python packages for science, math, engineering, and data analysis. Users may install all of Anaconda at once, or they may install Miniconda at first and then use conda to install any other packages they need, including any of the packages in Anaconda.
Once you have conda, you may install "R Essentials" into the current environment:
conda install -c r r-essentials
or create a new environment just for "R essentials":
conda create -n my-r-env -c r r-essentials
Jupyter provides a great notebook interface to write your analysis and share it with your peers. Open a shell and run this command to start the Jupyter notebook interface in your browser:
jupyter notebook
You can immediately write and run R code in the notebook cells.
Now you can:
library(dplyr)
iris
:iris
Sepal.Length | Sepal.Width | Petal.Length | Petal.Width | Species | |
---|---|---|---|---|---|
1 | 5.1 | 3.5 | 1.4 | 0.2 | setosa |
2 | 4.9 | 3 | 1.4 | 0.2 | setosa |
3 | 4.7 | 3.2 | 1.3 | 0.2 | setosa |
4 | 4.6 | 3.1 | 1.5 | 0.2 | setosa |
5 | 5 | 3.6 | 1.4 | 0.2 | setosa |
6 | 5.4 | 3.9 | 1.7 | 0.4 | setosa |
7 | 4.6 | 3.4 | 1.4 | 0.3 | setosa |
8 | 5 | 3.4 | 1.5 | 0.2 | setosa |
9 | 4.4 | 2.9 | 1.4 | 0.2 | setosa |
10 | 4.9 | 3.1 | 1.5 | 0.1 | setosa |
11 | 5.4 | 3.7 | 1.5 | 0.2 | setosa |
12 | 4.8 | 3.4 | 1.6 | 0.2 | setosa |
13 | 4.8 | 3 | 1.4 | 0.1 | setosa |
14 | 4.3 | 3 | 1.1 | 0.1 | setosa |
15 | 5.8 | 4 | 1.2 | 0.2 | setosa |
16 | 5.7 | 4.4 | 1.5 | 0.4 | setosa |
17 | 5.4 | 3.9 | 1.3 | 0.4 | setosa |
18 | 5.1 | 3.5 | 1.4 | 0.3 | setosa |
19 | 5.7 | 3.8 | 1.7 | 0.3 | setosa |
20 | 5.1 | 3.8 | 1.5 | 0.3 | setosa |
21 | 5.4 | 3.4 | 1.7 | 0.2 | setosa |
22 | 5.1 | 3.7 | 1.5 | 0.4 | setosa |
23 | 4.6 | 3.6 | 1 | 0.2 | setosa |
24 | 5.1 | 3.3 | 1.7 | 0.5 | setosa |
25 | 4.8 | 3.4 | 1.9 | 0.2 | setosa |
26 | 5 | 3 | 1.6 | 0.2 | setosa |
27 | 5 | 3.4 | 1.6 | 0.4 | setosa |
28 | 5.2 | 3.5 | 1.5 | 0.2 | setosa |
29 | 5.2 | 3.4 | 1.4 | 0.2 | setosa |
30 | 4.7 | 3.2 | 1.6 | 0.2 | setosa |
31 | 4.8 | 3.1 | 1.6 | 0.2 | setosa |
32 | 5.4 | 3.4 | 1.5 | 0.4 | setosa |
33 | 5.2 | 4.1 | 1.5 | 0.1 | setosa |
34 | 5.5 | 4.2 | 1.4 | 0.2 | setosa |
35 | 4.9 | 3.1 | 1.5 | 0.2 | setosa |
36 | 5 | 3.2 | 1.2 | 0.2 | setosa |
37 | 5.5 | 3.5 | 1.3 | 0.2 | setosa |
38 | 4.9 | 3.6 | 1.4 | 0.1 | setosa |
39 | 4.4 | 3 | 1.3 | 0.2 | setosa |
40 | 5.1 | 3.4 | 1.5 | 0.2 | setosa |
41 | 5 | 3.5 | 1.3 | 0.3 | setosa |
42 | 4.5 | 2.3 | 1.3 | 0.3 | setosa |
43 | 4.4 | 3.2 | 1.3 | 0.2 | setosa |
44 | 5 | 3.5 | 1.6 | 0.6 | setosa |
45 | 5.1 | 3.8 | 1.9 | 0.4 | setosa |
46 | 4.8 | 3 | 1.4 | 0.3 | setosa |
47 | 5.1 | 3.8 | 1.6 | 0.2 | setosa |
48 | 4.6 | 3.2 | 1.4 | 0.2 | setosa |
49 | 5.3 | 3.7 | 1.5 | 0.2 | setosa |
50 | 5 | 3.3 | 1.4 | 0.2 | setosa |
51 | 7 | 3.2 | 4.7 | 1.4 | versicolor |
52 | 6.4 | 3.2 | 4.5 | 1.5 | versicolor |
53 | 6.9 | 3.1 | 4.9 | 1.5 | versicolor |
54 | 5.5 | 2.3 | 4 | 1.3 | versicolor |
55 | 6.5 | 2.8 | 4.6 | 1.5 | versicolor |
56 | 5.7 | 2.8 | 4.5 | 1.3 | versicolor |
57 | 6.3 | 3.3 | 4.7 | 1.6 | versicolor |
58 | 4.9 | 2.4 | 3.3 | 1 | versicolor |
59 | 6.6 | 2.9 | 4.6 | 1.3 | versicolor |
60 | 5.2 | 2.7 | 3.9 | 1.4 | versicolor |
61 | 5 | 2 | 3.5 | 1 | versicolor |
62 | 5.9 | 3 | 4.2 | 1.5 | versicolor |
63 | 6 | 2.2 | 4 | 1 | versicolor |
64 | 6.1 | 2.9 | 4.7 | 1.4 | versicolor |
65 | 5.6 | 2.9 | 3.6 | 1.3 | versicolor |
66 | 6.7 | 3.1 | 4.4 | 1.4 | versicolor |
67 | 5.6 | 3 | 4.5 | 1.5 | versicolor |
68 | 5.8 | 2.7 | 4.1 | 1 | versicolor |
69 | 6.2 | 2.2 | 4.5 | 1.5 | versicolor |
70 | 5.6 | 2.5 | 3.9 | 1.1 | versicolor |
71 | 5.9 | 3.2 | 4.8 | 1.8 | versicolor |
72 | 6.1 | 2.8 | 4 | 1.3 | versicolor |
73 | 6.3 | 2.5 | 4.9 | 1.5 | versicolor |
74 | 6.1 | 2.8 | 4.7 | 1.2 | versicolor |
75 | 6.4 | 2.9 | 4.3 | 1.3 | versicolor |
76 | 6.6 | 3 | 4.4 | 1.4 | versicolor |
77 | 6.8 | 2.8 | 4.8 | 1.4 | versicolor |
78 | 6.7 | 3 | 5 | 1.7 | versicolor |
79 | 6 | 2.9 | 4.5 | 1.5 | versicolor |
80 | 5.7 | 2.6 | 3.5 | 1 | versicolor |
81 | 5.5 | 2.4 | 3.8 | 1.1 | versicolor |
82 | 5.5 | 2.4 | 3.7 | 1 | versicolor |
83 | 5.8 | 2.7 | 3.9 | 1.2 | versicolor |
84 | 6 | 2.7 | 5.1 | 1.6 | versicolor |
85 | 5.4 | 3 | 4.5 | 1.5 | versicolor |
86 | 6 | 3.4 | 4.5 | 1.6 | versicolor |
87 | 6.7 | 3.1 | 4.7 | 1.5 | versicolor |
88 | 6.3 | 2.3 | 4.4 | 1.3 | versicolor |
89 | 5.6 | 3 | 4.1 | 1.3 | versicolor |
90 | 5.5 | 2.5 | 4 | 1.3 | versicolor |
91 | 5.5 | 2.6 | 4.4 | 1.2 | versicolor |
92 | 6.1 | 3 | 4.6 | 1.4 | versicolor |
93 | 5.8 | 2.6 | 4 | 1.2 | versicolor |
94 | 5 | 2.3 | 3.3 | 1 | versicolor |
95 | 5.6 | 2.7 | 4.2 | 1.3 | versicolor |
96 | 5.7 | 3 | 4.2 | 1.2 | versicolor |
97 | 5.7 | 2.9 | 4.2 | 1.3 | versicolor |
98 | 6.2 | 2.9 | 4.3 | 1.3 | versicolor |
99 | 5.1 | 2.5 | 3 | 1.1 | versicolor |
100 | 5.7 | 2.8 | 4.1 | 1.3 | versicolor |
101 | 6.3 | 3.3 | 6 | 2.5 | virginica |
102 | 5.8 | 2.7 | 5.1 | 1.9 | virginica |
103 | 7.1 | 3 | 5.9 | 2.1 | virginica |
104 | 6.3 | 2.9 | 5.6 | 1.8 | virginica |
105 | 6.5 | 3 | 5.8 | 2.2 | virginica |
106 | 7.6 | 3 | 6.6 | 2.1 | virginica |
107 | 4.9 | 2.5 | 4.5 | 1.7 | virginica |
108 | 7.3 | 2.9 | 6.3 | 1.8 | virginica |
109 | 6.7 | 2.5 | 5.8 | 1.8 | virginica |
110 | 7.2 | 3.6 | 6.1 | 2.5 | virginica |
111 | 6.5 | 3.2 | 5.1 | 2 | virginica |
112 | 6.4 | 2.7 | 5.3 | 1.9 | virginica |
113 | 6.8 | 3 | 5.5 | 2.1 | virginica |
114 | 5.7 | 2.5 | 5 | 2 | virginica |
115 | 5.8 | 2.8 | 5.1 | 2.4 | virginica |
116 | 6.4 | 3.2 | 5.3 | 2.3 | virginica |
117 | 6.5 | 3 | 5.5 | 1.8 | virginica |
118 | 7.7 | 3.8 | 6.7 | 2.2 | virginica |
119 | 7.7 | 2.6 | 6.9 | 2.3 | virginica |
120 | 6 | 2.2 | 5 | 1.5 | virginica |
121 | 6.9 | 3.2 | 5.7 | 2.3 | virginica |
122 | 5.6 | 2.8 | 4.9 | 2 | virginica |
123 | 7.7 | 2.8 | 6.7 | 2 | virginica |
124 | 6.3 | 2.7 | 4.9 | 1.8 | virginica |
125 | 6.7 | 3.3 | 5.7 | 2.1 | virginica |
126 | 7.2 | 3.2 | 6 | 1.8 | virginica |
127 | 6.2 | 2.8 | 4.8 | 1.8 | virginica |
128 | 6.1 | 3 | 4.9 | 1.8 | virginica |
129 | 6.4 | 2.8 | 5.6 | 2.1 | virginica |
130 | 7.2 | 3 | 5.8 | 1.6 | virginica |
131 | 7.4 | 2.8 | 6.1 | 1.9 | virginica |
132 | 7.9 | 3.8 | 6.4 | 2 | virginica |
133 | 6.4 | 2.8 | 5.6 | 2.2 | virginica |
134 | 6.3 | 2.8 | 5.1 | 1.5 | virginica |
135 | 6.1 | 2.6 | 5.6 | 1.4 | virginica |
136 | 7.7 | 3 | 6.1 | 2.3 | virginica |
137 | 6.3 | 3.4 | 5.6 | 2.4 | virginica |
138 | 6.4 | 3.1 | 5.5 | 1.8 | virginica |
139 | 6 | 3 | 4.8 | 1.8 | virginica |
140 | 6.9 | 3.1 | 5.4 | 2.1 | virginica |
141 | 6.7 | 3.1 | 5.6 | 2.4 | virginica |
142 | 6.9 | 3.1 | 5.1 | 2.3 | virginica |
143 | 5.8 | 2.7 | 5.1 | 1.9 | virginica |
144 | 6.8 | 3.2 | 5.9 | 2.3 | virginica |
145 | 6.7 | 3.3 | 5.7 | 2.5 | virginica |
146 | 6.7 | 3 | 5.2 | 2.3 | virginica |
147 | 6.3 | 2.5 | 5 | 1.9 | virginica |
148 | 6.5 | 3 | 5.2 | 2 | virginica |
149 | 6.2 | 3.4 | 5.4 | 2.3 | virginica |
150 | 5.9 | 3 | 5.1 | 1.8 | virginica |
iris %>%
group_by(Species) %>%
summarise(Sepal.Width.Avg = mean(Sepal.Width)) %>%
arrange(Sepal.Width.Avg)
Species | Sepal.Width.Avg | |
---|---|---|
1 | versicolor | 2.77 |
2 | virginica | 2.974 |
3 | setosa | 3.428 |
library(ggplot2)
ggplot(data=iris, aes(x=Sepal.Length, y=Sepal.Width, color=Species)) + geom_point(size=3)
For our users' convenience, we have packaged some of the most used R packages for data science in "R Essentials". It is also very easy to create your own custom set of R packages to share with peers with the conda metapackage
command. For example, to provide a download called custom-r-bundle with only the libraries used in our example notebook, just create the metapackage:
conda metapackage custom-r-bundle 0.1.0 --dependencies r-irkernel jupyter r-ggplot2 r-dplyr --summary "My custom R bundle"
Share it with colleagues by uploading it to Anaconda.org:
conda install anaconda-client
anaconda login
anaconda upload custom-r-bundle-0.1.0-0.tar.bz2
Now anyone can get the bundle with those packages and dependencies by replacing "anacondauser" with your Anaconda.org username and running this command:
conda install -c anacondauser custom-r-bundle
Jupyter can convert a notebook into an online slide deck for talks and tutorials.
To convert a notebook into a reveal.js presentation, set "Cell Toolbar" to "Slideshow":
And convert:
jupyter nbconvert my_r_notebook.ipynb --to slides --post serve