This is one of the 100 recipes of the IPython Cookbook, the definitive guide to high-performance scientific computing and data science in Python.
UPDATE (2014-09-29): in newer versions of rpy2, the IPython extension with the R magic is rpy2.ipython and not rmagic as stated in the book.
Using R from IPython takes three steps: install R, install rpy2 (the R-to-Python interface), and load the IPython R extension. The two installations are needed only once, whereas the extension must be loaded in every IPython session where you want to use R.
rpy2 does not appear to work well on Windows. We recommend using Linux or Mac OS X.
To install R and rpy2 on Ubuntu, run the following commands:
sudo apt-get install r-base-dev
sudo apt-get install python-rpy2
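Once R and rpy2 are installed, the extension is loaded in an IPython session with the %load_ext magic. Because the extension name depends on the rpy2 version (see the update note above), the following is a minimal sketch of how one might check which name applies. The helper r_extension_name is hypothetical and not part of rpy2 or IPython, and treating "rpy2 without rpy2.ipython" as an older version is an assumption.

```python
import importlib.util


def r_extension_name():
    """Guess which name to pass to %load_ext (hypothetical helper).

    Returns "rpy2.ipython" when that module can be found, "rmagic" when
    rpy2 is installed without it (assumed to be an older version), and
    None when rpy2 is not installed at all.
    """
    try:
        if importlib.util.find_spec("rpy2") is None:
            return None
        if importlib.util.find_spec("rpy2.ipython") is not None:
            return "rpy2.ipython"
    except ImportError:
        # Importing the parent package failed; treat rpy2 as unusable.
        return None
    return "rmagic"


extension = r_extension_name()
print(extension)
# In an IPython session, the extension would then be loaded with, e.g.:
#   %load_ext rpy2.ipython
```

If the helper prints None, rpy2 is probably not installed in the Python environment that IPython is using.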
Here, we will use the following workflow. First, we load data from Python. Then, we use R to design and fit a model, and to make some plots in the IPython notebook. We could also load data from R, or design and fit a statistical model with Python's statsmodels package, etc. In particular, the analysis we do here could be done entirely in Python, without resorting to the R language. This recipe just shows the basics of R and illustrates how R and Python can play together within an IPython session.
import statsmodels.datasets as sd
data = sd.longley.load_pandas()
We define x and y as the exogenous (independent) and endogenous (dependent) variables, respectively. The endogenous variable quantifies the total employment in the country.
y, x = data.endog, data.exog
x['TOTEMP'] = y
To pass Python variables to R, we can use the %R -i var1,var2 magic. Then, we can call R's plot function.
gnp = x['GNP']
totemp = x['TOTEMP']
%R -i totemp,gnp plot(gnp, totemp)
The lm function lets us perform a linear regression. Here, we want to express totemp (total employment) as a function of the country's GNP.
%%R
fit <- lm(totemp ~ gnp)  # Least-squares regression
print(fit$coefficients)  # Display the coefficients of the fit.
plot(gnp, totemp)  # Plot the data points.
abline(fit)  # And plot the linear regression.
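As noted earlier, the same analysis could be done entirely in Python. To illustrate what a one-predictor least-squares fit such as lm(totemp ~ gnp) computes, here is a minimal pure-Python sketch using the closed-form formulas for the intercept and slope. The function fit_line is a hypothetical name, and the data points are synthetic (they lie exactly on y = 1 + 2x), not the Longley values.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope), the same order in which R prints
    the coefficients of a one-predictor linear model.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Synthetic data lying exactly on the line y = 1 + 2x (illustration only).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_line(xs, ys)
print(intercept, slope)
```

With real data, one would pass the columns as plain sequences, for example fit_line(list(gnp), list(totemp)), or use a dedicated package such as statsmodels for standard errors and diagnostics.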