This is one of the 100 recipes of the IPython Cookbook, the definitive guide to high-performance scientific computing and data science in Python.
UPDATE (2014-09-29): in newer versions of rpy2, the IPython extension with the R magic is rpy2.ipython and not rmagic as stated in the book.
There are three steps to use R from IPython. First, install R and rpy2 (R to Python interface). Of course, you only need to do this step once. Then, to use R in an IPython session, you need to load the IPython R extension.
To do so, run the following line first:
%load_ext rpy2.ipython
rpy2 does not appear to work well on Windows. We recommend using Linux or Mac OS X.
To install R and rpy2 on Ubuntu, run the following commands:
sudo apt-get install r-base-dev
sudo apt-get install python-rpy2
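Once the installation has finished, a quick check from Python can confirm that both pieces are visible. The following is only a minimal sketch using the standard library; check_r_setup is a hypothetical helper name that is not part of the recipe, and it assumes the R executable is called R and is on the PATH.

```python
import importlib.util
import shutil


def check_r_setup():
    """Report whether the R executable and the rpy2 package can be found.

    Hypothetical helper for illustration; not part of rpy2 or IPython.
    """
    return {
        # shutil.which returns the full path of the executable, or None.
        "R_executable": shutil.which("R"),
        # find_spec returns None when the package cannot be imported.
        "rpy2_installed": importlib.util.find_spec("rpy2") is not None,
    }


print(check_r_setup())
```

If R_executable is None or rpy2_installed is False, the corresponding installation step probably needs to be repeated before loading the extension.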
Here, we will use the following workflow. First, we load data from Python. Then, we use R to design and fit a model, and to make some plots in the IPython notebook. We could also load data from R, or design and fit a statistical model with Python's statsmodels package, etc. In particular, the analysis we do here could be done entirely in Python, without resorting to the R language. This recipe just shows the basics of R and illustrates how R and Python can play together within an IPython session.
We load the Longley dataset with the statsmodels package.
import statsmodels.datasets as sd
data = sd.longley.load_pandas()
We also load the IPython R extension.
%load_ext rpy2.ipython
We define x and y as the exogenous (independent) and endogenous (dependent) variables, respectively. The endogenous variable quantifies the total employment in the country.
data.endog_name, data.exog_name
y, x = data.endog, data.exog
We add the endogenous variable to the x DataFrame.
x['TOTEMP'] = y
x
To pass Python variables to R, we can use the %R -i var1,var2 magic. Then, we can call R's plot command.
gnp = x['GNP']
totemp = x['TOTEMP']
%R -i totemp,gnp plot(gnp, totemp)
The lm function lets us perform a linear regression. Here, we want to express totemp (total employment) as a function of the country's GNP.
%%R
fit <- lm(totemp ~ gnp); # Least-squares regression
print(fit$coefficients) # Display the coefficients of the fit.
plot(gnp, totemp) # Plot the data points.
abline(fit) # And plot the linear regression.
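As noted earlier, the same analysis could be done entirely in Python. To make explicit what the lm call computes for a single predictor, here is a minimal pure-Python sketch of the closed-form least-squares fit. The function name fit_line and the toy values are assumptions made for illustration; they are not part of the recipe and the values are not the Longley data.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = intercept + slope * xs.

    Hypothetical helper for illustration; returns (intercept, slope),
    the same order in which R reports the coefficients of lm(y ~ x).
    """
    xs, ys = list(xs), list(ys)
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Toy values made up for illustration (not the Longley data).
intercept, slope = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
print(intercept, slope)  # prints: 1.0 2.0
```

With the recipe's data, fit_line(x['GNP'], x['TOTEMP']) should give the same two coefficients that print(fit$coefficients) displays, up to floating-point rounding.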
You'll find all the explanations, figures, references, and much more in the book (to be released later this summer).
IPython Cookbook, by Cyrille Rossant, Packt Publishing, 2014 (500 pages).