The notebook, like an Rmd file, allows us to blend text, code and code outputs.
Just as in Rmd, we can write nicely styled text from vanilla markdown:
We can write inline code, or language-sensitive styled code blocks:
# ggplot2 examples
library(ggplot2)
# create factors with value labels
mtcars$gear <- factor(mtcars$gear,levels=c(3,4,5),
labels=c("3gears","4gears","5gears"))
We can write inline equations using LaTeX, such as $e=mc^2$, as well as equations that run across multiple lines:
\begin{align}
\dot{x} & = \sigma(y-x) \\
\dot{y} & = \rho x - y - xz \\
\dot{z} & = -\beta z + xy
\end{align}
Code is entered and executed via code cells. The execution environment is determined by the kernel attached to the notebook.
This notebook has been associated with an R kernel, which means we can write R code in the cells:
# ggplot2 examples
library(ggplot2)
As in Rmd documents, state set or packages loaded by executing code in one cell are available to cells executed later.
So we can work with the mtcars dataset (which ships with base R's datasets package rather than with ggplot2):
# create factors with value labels
mtcars$gear <- factor(mtcars$gear,levels=c(3,4,5),
labels=c("3gears","4gears","5gears"))
mtcars$am <- factor(mtcars$am,levels=c(0,1),
labels=c("Automatic","Manual"))
mtcars$cyl <- factor(mtcars$cyl,levels=c(4,6,8),
labels=c("4cyl","6cyl","8cyl"))
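One thing to watch with shared state: a cell like the one above overwrites mtcars in place, so running it a second time acts on the already relabelled factor. The sketch below uses a copy of the data and a hypothetical relabel() helper (my name, not part of any package) to show what I'd expect to happen on a re-run:

```r
# Work on a copy so the notebook's own mtcars is left alone
cars_df <- datasets::mtcars

# Hypothetical helper wrapping the same factor() call used above
relabel <- function(d) {
  d$gear <- factor(d$gear, levels = c(3, 4, 5),
                   labels = c("3gears", "4gears", "5gears"))
  d
}

once  <- relabel(cars_df)  # numeric 3/4/5 mapped to labels
twice <- relabel(once)     # "3gears" etc. no longer match levels 3/4/5

sum(is.na(once$gear))   # number of missing values after one run
sum(is.na(twice$gear))  # number of missing values after a second run
```

If the second count is non-zero, restarting the kernel (or reloading the data) before re-running the cell is the simple way out.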
As with Rmd, if the last line of a code cell returns an object, we can display it. This includes things like charts:
# Kernel density plots for mpg
# grouped by number of gears (indicated by color)
g = qplot(mpg, data=mtcars, geom="density", fill=gear, alpha=I(.5),
main="Distribution of Gas Mileage", xlab="Miles Per Gallon",
ylab="Density")
g
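As far as I know, qplot() has been deprecated in more recent ggplot2 releases (3.4.0 onwards, I believe), so here is a sketch of what I think is the equivalent chart built with ggplot() directly. The names cars_df and g2 are my own, and the data is copied so the example does not depend on the earlier cells:

```r
library(ggplot2)

# Copy the data and relabel gear, as in the earlier cell
cars_df <- datasets::mtcars
cars_df$gear <- factor(cars_df$gear, levels = c(3, 4, 5),
                       labels = c("3gears", "4gears", "5gears"))

# Kernel density of mpg, filled by number of gears, semi-transparent
g2 <- ggplot(cars_df, aes(x = mpg, fill = gear)) +
  geom_density(alpha = 0.5) +
  labs(title = "Distribution of Gas Mileage",
       x = "Miles Per Gallon",
       y = "Density")

# The last line returns the plot object, so the notebook displays it
g2
```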
As in any other R environment, we can change the aesthetics through the application of a particular theme.
For example, theme_commonslib() is a theme used by the House of Commons Library; we can install it from its GitHub repository:
install.packages("remotes")
remotes::install_github("olihawkins/clplot")
And then simply add the theme to the chart:
library(ggplot2)
library(clplot)
g + theme_commonslib()
It's some time since I looked properly at how we could embed interactive elements in a notebook.
For example, it's easy enough to generate a widget:
#install.packages("leaflet")
library(leaflet)
m <- leaflet() %>%
addTiles() %>% # Add default OpenStreetMap map tiles
addMarkers(lng=174.768, lat=-36.852, popup="The birthplace of R")
But I'm not sure if we can directly embed the result yet?
This is one of my old workarounds: save the widget HTML to a file and then load it back in via an IFrame...
library(htmlwidgets)
library(IRdisplay)
saveWidget(m, 'demo.html', selfcontained = TRUE)
display_html('<iframe src="demo.html" width="100%" height=600></iframe>')
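If the save-and-iframe workaround gets used more than once, it can be wrapped up in a couple of small functions. This is only a sketch: iframe_tag() and show_widget() are hypothetical names of my own, and the saveWidget() and display_html() calls are the same ones used above.

```r
# Hypothetical helper: build the iframe markup for a saved widget file
iframe_tag <- function(src, height = 600) {
  sprintf('<iframe src="%s" width="100%%" height="%d"></iframe>',
          src, as.integer(height))
}

# Hypothetical helper: save a widget to disk, then display it via an iframe.
# Assumes the htmlwidgets and IRdisplay packages are installed.
show_widget <- function(widget, file = "widget.html", height = 600) {
  htmlwidgets::saveWidget(widget, file, selfcontained = TRUE)
  IRdisplay::display_html(iframe_tag(file, height))
}
```

With these defined, show_widget(m, "demo.html") should do the same job as the two lines above.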
We can generate nicely styled tables with a bit of help...
install.packages("kableExtra")
Generate an HTML table and display it as such, via a slight hack...
library(knitr)
library(kableExtra)
kable(summary(cars)) %>%
kable_styling(bootstrap_options = c("striped", "hover")) %>%
as.character() %>% display_html()
This example is cribbed in part from https://medium.com/@traffordDataLab/querying-apis-in-r-39029b73d5f1 and uses the UK Police API, which returns JSON data in response to requests of the form: https://data.police.uk/api/crimes-street/burglary?lat=52.0406&lng=-0.7594&date=2018-05
library(tidyverse)
library(httr)
library(jsonlite)
── Attaching packages ─────────────────────────────────────── tidyverse 1.2.1 ──
✔ tibble  2.1.1       ✔ purrr   0.3.2
✔ tidyr   0.8.3       ✔ dplyr   0.8.0.1
✔ readr   1.3.1       ✔ stringr 1.4.0
✔ tibble  2.1.1       ✔ forcats 0.4.0
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::group_rows() masks kableExtra::group_rows()
✖ dplyr::lag() masks stats::lag()
Attaching package: ‘jsonlite’
The following object is masked from ‘package:purrr’:
flatten
We can query the API and get a JSON response back:
path <- "https://data.police.uk/api/crimes-street/burglary?"
request <- GET(url = path,
query = list(
lat = 52.0406,
lng = -0.7594,
date = "2018-05")
)
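The request above assumes the call succeeds. A slightly more defensive version might check the HTTP status before decoding anything. The sketch below wraps the same GET() call in a hypothetical fetch_crimes() function (my name, not part of httr or the API), using httr's stop_for_status() to raise an error on a failed request:

```r
# Hypothetical helper: query the street-level crimes endpoint and return
# the raw JSON text. Assumes the httr package is installed.
fetch_crimes <- function(lat, lng, date, category = "burglary") {
  url <- paste0("https://data.police.uk/api/crimes-street/", category)
  resp <- httr::GET(url = url,
                    query = list(lat = lat, lng = lng, date = date))
  # Turn 4xx/5xx responses into an R error rather than parsing an error page
  httr::stop_for_status(resp)
  httr::content(resp, as = "text", encoding = "UTF-8")
}
```

Calling fetch_crimes(52.0406, -0.7594, "2018-05") should then give the same text that content() is used to extract in the next step.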
Decoding the JSON returns most columns, including the latitude and longitude, as character strings:
response <- content(request, as = "text", encoding = "UTF-8")
df <- fromJSON(response, flatten = TRUE) %>%
data.frame()
head(df)
| | category | location_type | context | persistent_id | id | location_subtype | month | location.latitude | location.longitude | location.street.id | location.street.name | outcome_status.category | outcome_status.date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | <chr> | <chr> | <chr> | <chr> | <int> | <chr> | <chr> | <chr> | <chr> | <int> | <chr> | <chr> | <chr> |
| 1 | burglary | Force | | fa6b515b3d9658626797bb6db06caefcab8d05e4ac387914c12d3077368b402d | 65373360 | | 2018-05 | 52.052319 | -0.756575 | 1213725 | On or near Bryony Place | Investigation complete; no suspect identified | 2018-08 |
| 2 | burglary | Force | | 7480cb4911243cc2f48669e71fa2fb7320f2197b6ccb6390be06281c898b5074 | 65379707 | | 2018-05 | 52.044662 | -0.767213 | 1213735 | On or near Wandsworth Place | Investigation complete; no suspect identified | 2018-08 |
| 3 | burglary | Force | | 6ffc8736dc157b33cf0d9f8d3a858ad1d4d41fc92331e34145be0ef465cf8eab | 65382710 | | 2018-05 | 52.033215 | -0.746623 | 1213362 | On or near Trevone Court | Unable to prosecute suspect | 2018-08 |
| 4 | burglary | Force | | 133441a73f4690a1d4e1873b7696fdd0240b1c330cc84337bf6243435d41159d | 65370281 | | 2018-05 | 52.027977 | -0.760603 | 1212564 | On or near Rashleigh Place | Investigation complete; no suspect identified | 2018-08 |
| 5 | burglary | Force | | 1e38aec62bd62383218f27bbbe41567c28d39900a5bb965e00b0a4e5e6db4844 | 65371736 | | 2018-05 | 52.044649 | -0.758304 | 1213743 | On or near North Tenth Street | Unable to prosecute suspect | 2018-08 |
| 6 | burglary | Force | | 04c25368c11760f02c45b701d21e1583fba2e31bb53ad4e1aa4c9dd55ba5c412 | 65377770 | | 2018-05 | 52.030466 | -0.743479 | 1213353 | On or near Ashby | Investigation complete; no suspect identified | 2018-05 |
So I'm going to be lazy in how I cast the lat/long to a numeric and use a package to help:
#install.packages("hablar")
library(hablar)
The cast is then just a convert() pipeline step:
df <- select(df,
month, category,
location = location.street.name,
long = location.longitude,
lat = location.latitude) %>%
convert(num(long, lat))
head(df)
month | category | location | long | lat |
---|---|---|---|---|
<chr> | <chr> | <chr> | <dbl> | <dbl> |
2018-05 | burglary | On or near Bryony Place | -0.756575 | 52.05232 |
2018-05 | burglary | On or near Wandsworth Place | -0.767213 | 52.04466 |
2018-05 | burglary | On or near Trevone Court | -0.746623 | 52.03321 |
2018-05 | burglary | On or near Rashleigh Place | -0.760603 | 52.02798 |
2018-05 | burglary | On or near North Tenth Street | -0.758304 | 52.04465 |
2018-05 | burglary | On or near Ashby | -0.743479 | 52.03047 |
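For comparison, the same cast can be done without an extra package using base R's as.numeric(). This is a small self-contained sketch: df_demo is a made-up two-row data frame holding values copied from the API output shown earlier, not the full result.

```r
# Two rows of character coordinates, as returned by the JSON decoding step
df_demo <- data.frame(
  location.longitude = c("-0.756575", "-0.767213"),
  location.latitude  = c("52.052319", "52.044662"),
  stringsAsFactors = FALSE
)

# Cast the character columns to numeric (double) columns
df_demo$long <- as.numeric(df_demo$location.longitude)
df_demo$lat  <- as.numeric(df_demo$location.latitude)

# Inspect the resulting column types
str(df_demo)
```

The hablar route is tidier inside a pipeline; the base R version just avoids the dependency.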
m = leaflet(df) %>% addTiles() %>% addMarkers(
clusterOptions = markerClusterOptions(), popup = ~as.character(location)
)
saveWidget(m, 'demo.html', selfcontained = TRUE)
display_html('<iframe src="demo.html" width="100%" height=600></iframe>')
Assuming "long" and "lat" are longitude and latitude, respectively