stata_kernel Jupyter notebook
This Jupyter notebook is an example of how you can use Stata in the Jupyter ecosystem using stata_kernel.
Full documentation, including how to install, is available at https://kylebarron.dev/stata_kernel/.
The Jupyter Notebook is a file format that permits interactive coding with text, code, and results in a single document. You can share a notebook file (with extension .ipynb), and the results will be viewable without running the code. If the recipient also has Jupyter installed, they can edit and re-run the code cells.
Jupyter itself is language agnostic: it can run code in any language for which a kernel is installed. This document uses Stata code, but you can also code in Jupyter using Python, R, Julia, Matlab, and SAS.
display "Hello, world!"
Hello, world!
You can run a cell by pressing Ctrl+Enter or Shift+Enter. If a number appears in the brackets to the left of the input cell, that means that the code was successfully run (sometimes a cell doesn't produce any output).
If you don't see Hello, world! as output, check out the troubleshooting tips.
Let's load the included auto dataset.
sysuse auto.dta
(1978 Automobile Data)
Now the auto dataset is in memory.
Nearly all commands that work in Stata work through Jupyter as well. A couple of commands that depend on the Graphical User Interface, such as browse and edit, only work on Windows.
tabulate foreign headroom
           |                                   Headroom (in.)
  Car type |       1.5        2.0        2.5        3.0        3.5        4.0        4.5        5.0 |     Total
-----------+----------------------------------------------------------------------------------------+----------
  Domestic |         3         10          4          7         13         10          4          1 |        52
   Foreign |         1          3         10          6          2          0          0          0 |        22
-----------+----------------------------------------------------------------------------------------+----------
     Total |         4         13         14         13         15         10          4          1 |        74
No special syntax is needed to generate graphs. Just write commands like you're used to. The display order of graphs will always be the same as the order in the code.
// Dataset with test scores
use "https://stats.idre.ucla.edu/stat/stata/notes/hsb2", clear
scatter read math, title("Reading score vs Math score")
scatter math science, title("Math score vs Science score")
(highschool and beyond (200 cases))
If you don't want to display a graph, just prefix the command with quietly.
quietly scatter read math, title("Reading score vs Math score")
stata_kernel lets you use any format of comments, including //, ///, *, and /* */, even in an interactive console environment where the Stata command line normally wouldn't accept them.
display "displayed"
// display "comment"
displayed
display "line continuation " /// comment
"comment"
line continuation comment
* display "not displayed"
display "displayed1"
/*
display "displayed2"
*/
display "displayed3"
displayed1
displayed3
stata_kernel provides autocompletion for locals, globals, variables, scalars, and matrices based on the contents in memory. It also suggests file paths to load or save files. Press Tab while typing to activate it.
Magics are special commands that stata_kernel provides to give extra functionality, especially regarding the connection with Jupyter.
These commands all start with %. You can run %help magics or consult the documentation linked above to see a list of available magics. You can also run %magic_name --help to see the help for any given magic.
To prevent confusion, these commands must occur at the beginning of a cell.
%head, %browse, %tail
%head, %browse, and %tail show a well-formatted portion of the dataset in memory.
%head 5
  | id | female | race | ses | schtyp | prog | read | write | math | science | socst |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | 70 | male | white | low | public | general | 57 | 52 | 41 | 47 | 57 |
2 | 121 | female | white | middle | public | vocation | 68 | 59 | 53 | 63 | 61 |
3 | 86 | male | white | high | public | general | 44 | 33 | 54 | 58 | 31 |
4 | 141 | male | white | high | public | vocation | 63 | 44 | 47 | 53 | 56 |
5 | 172 | male | white | middle | public | academic | 47 | 52 | 57 | 53 | 61 |
%help
%help shows the help menu for a given command. The links in this help file are clickable, just like the official Stata documentation. (This command requires internet access.)
%help summarize
%locals, %globals
%locals and %globals display the local or global macros in the current environment.
local local1 "foo"
local local2 "bar"
local abcd "foo bar"
%locals
abcd: foo bar
local2: bar
local1: foo
%locals loc
local2: bar
local1: foo
%globals
(note: showing first line of global values; run with --verbose)
T_gm_fix_span: 0
stata_kernel_graph_counter: 2
S_FNDATE: 17 Jun 2002 08:48
S_FN: https://stats.idre.ucla.edu/stat/stata/notes/hsb2.dta
S_ADO: BASE;SITE;.;PERSONAL;PLUS;OLDPLACE;`"/Users/kyle/github/stata/stata-kernel/stata_kernel/ado"'
S_level: 95
F1: help advice;
F2: describe;
F7: save
F8: use
S_StataSE: SE
S_CONSOLE: console
S_FLAVOR: Intercooled
S_OS: Unix
S_MACH: Macintosh (Intel 64-bit)
%html, %latex
%html and %latex attempt to render the output of a cell (not the code you typed) as HTML or LaTeX, respectively. This could be used, for example, with estout to display several regression results side-by-side.
Note: Jupyter can display a math subset of LaTeX but doesn't support tables. However, it's really easy to export a Jupyter Notebook file to PDF through LaTeX (see File > Export Notebook As > Export Notebook to PDF). In this PDF export, LaTeX tables will be properly displayed.
cap ssc install estout
sysuse auto, clear
eststo clear
eststo: qui regress price mpg rep78
eststo: qui regress price mpg rep78 gear_ratio trunk
eststo: qui regress price mpg rep78 gear_ratio trunk weight displacement
(1978 Automobile Data)
(est1 stored)
(est2 stored)
(est3 stored)
%html
esttab, label title("Regression Table") html
  | (1) | (2) | (3) |
  | Price | Price | Price |
Mileage (mpg) | -271.6*** | -206.3* | -76.96 |
  | (-4.70) | (-2.65) | (-0.92) |
Repair Record 1978 | 667.0 | 767.1* | 899.1** |
  | (1.95) | (2.17) | (2.76) |
Gear Ratio | | -1289.9 | 1479.7 |
  | | (-1.38) | (1.30) |
Trunk space (cu. ft.) | | 12.49 | -110.3 |
  | | (0.14) | (-1.23) |
Weight (lbs.) | | | 1.140 |
  | | | (1.00) |
Displacement (cu. in.) | | | 17.82 |
  | | | (1.88) |
Constant | 9657.8*** | 11620.1*** | -5163.3 |
  | (7.17) | (3.65) | (-0.94) |
Observations | 69 | 69 | 69 |
t statistics in parentheses
* p < 0.05, ** p < 0.01, *** p < 0.001
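The same stored estimates can also be sent through %latex. The cell below is a minimal sketch, not part of the original notebook: it assumes est1 to est3 from the eststo calls above are still in memory, and it swaps the html option of esttab for tex. Because Jupyter does not render LaTeX tables, the result is mainly useful in the PDF export described in the note above.

```stata
%latex
esttab, label title("Regression Table") tex
```

As with %html, the magic has to be the first line of the cell.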
%show_gui, %hide_gui
On Windows, %show_gui and %hide_gui show and hide the traditional Stata Graphical User Interface window. These magics do not work on macOS or Linux because those platforms communicate with Stata in a different manner.
;-delimited commands
Often with long commands, such as graphs, using #delimit ; helps prevent very long lines and keeps code more readable. This is supported in stata_kernel, even though it is not allowed in the normal Stata command-line environment.
sysuse auto, clear
(1978 Automobile Data)
#delimit ;
display "Hello, world!";
Hello, world! delimiter now ;
It's important to note that the ;-delimiter mode persists across cells. stata_kernel will expect cells to include ; for each command, and will raise an error if ; is missing.
display "Hello, world!"
stata_kernel error: code entered was incomplete. This usually means that a loop or program was not correctly terminated. This can also happen if you are in `#delimit ;` mode and did not end the command with `;`. Use `%delimit` to see the current delimiter mode and use `#delimit cr` to switch back to the default mode where `;` is unnecessary.
You can check the current delimiter with the %delimit magic.
%delimit
The delimiter is currently: ;
You can switch back to normal line-break delimited commands (i.e. where ; is unnecessary) with #delimit cr.
#delimit cr
delimiter now cr
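The Hello, world! cells above only show the mechanics. As a sketch of the case this section is aimed at, a long graph command can be spread over several lines in ;-delimited mode. The particular graph (price against mpg, split by foreign) is an illustrative choice using the auto dataset, not something taken from the original notebook.

```stata
// Load the example data so the cell runs on its own
sysuse auto, clear

// Switch to ;-delimited mode for the long graph command
#delimit ;
twoway
    (scatter price mpg if foreign == 0)
    (scatter price mpg if foreign == 1),
    title("Price vs mileage")
    legend(order(1 "Domestic" 2 "Foreign"));

#delimit cr
// Back in the default mode: no trailing ; needed from here on
display "done"
```

Ending the cell with #delimit cr keeps the delimiter mode from leaking into later cells, which avoids the incomplete-code error shown above.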
You can start an interactive Mata session by typing mata. This persists across cells; cells will continue being Mata cells until you run end to exit the Mata session.
You can run the %status magic to check if you're in Mata or Stata mode.
sysuse auto, clear
(1978 Automobile Data)
mata
------------------------------------------------- mata (type end to exit) -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
%status
stata_kernel 1.9.0 for Stata 15.1
Delimiter: cr
Environment: Mata
y = st_data(., "price")
X = st_data(., "mpg trunk")
n = rows(X)
X = X,J(n,1,1)
XpX = quadcross(X, X)
XpXi = invsym(XpX)
b = XpXi*quadcross(X, y)
b'
                  1              2              3
    +----------------------------------------------+
  1 |  -220.1648801    43.55851009    10254.94983  |
    +----------------------------------------------+
end
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
%status
stata_kernel 1.9.0 for Stata 15.1
Delimiter: cr
Environment: Stata
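As a sanity check on the Mata session above, the hand-computed vector b = (X'X)^-1 X'y should agree with what regress reports for the same model. This is a small sketch, not part of the original notebook. It reloads the auto dataset so it can run on its own, and it should be run only after end, once %status reports the Stata environment.

```stata
// Reload the data so the cell does not depend on earlier state
sysuse auto, clear

// OLS of price on mpg and trunk; regress adds the constant itself,
// which corresponds to the column of ones appended to X in Mata
regress price mpg trunk

// Coefficient vector in the same order as the Mata result: mpg, trunk, constant
matrix list e(b)
```

The coefficients on mpg and trunk and the constant should match the three entries of b' printed by Mata (about -220.16, 43.56, and 10254.95).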