Prepared by F. Sheeka.
This notebook illustrates the Statistics Explained article on Digital economy and society statistics - households and individuals.
Figure 1: Internet access and broadband internet connections of households, EU-28, 2008-2018 (% of all households)
library(ggplot2)
library(tidyr)
library(repr)
library(dplyr)
library(devtools)
library(restatapi)
Attaching package: ‘dplyr’
The following objects are masked from ‘package:stats’: filter, lag
The following objects are masked from ‘package:base’: intersect, setdiff, setequal, union
Loading required package: usethis
restatapi:
- config file with the API version 1 loaded from GitHub (the 'current' API version number is 1).
- 2 from the 4 cores are used for parallel computing.
- 'libcurl' will be used for file download.
- the Table of contents (TOC) was not pre-loaded into the deafult cache ('.restatapi_env').
I've called the first dataset dataset1_1 to distinguish it from the second dataset we pull below (dataset1_2).
dataset1_1 <- get_eurostat_data(id="isoc_ci_in_h",
filters = list(geo = "EU28", unit = "PC_HH", hhtyp = "TOTAL"),
date_filter = "2008:2018")
dataset1_1$indic_is <- "INT_ACCESS"
dataset1_2 <- get_eurostat_data(id="isoc_ci_it_h",
filters = list(geo = "EU28", unit = "PC_HH", hhtyp = "TOTAL", indic_is = "H_BROAD"),
date_filter = "2008:2018")
dataset_fig1 <- rbind(dataset1_1, dataset1_2)
dataset_fig1$time <- as.numeric(as.character(dataset_fig1$time))
dataset_fig1
unit | hhtyp | geo | time | values | indic_is |
---|---|---|---|---|---|
<fct> | <fct> | <fct> | <dbl> | <dbl> | <fct> |
PC_HH | TOTAL | EU28 | 2008 | 60 | INT_ACCESS |
PC_HH | TOTAL | EU28 | 2009 | 66 | INT_ACCESS |
PC_HH | TOTAL | EU28 | 2010 | 70 | INT_ACCESS |
PC_HH | TOTAL | EU28 | 2011 | 73 | INT_ACCESS |
PC_HH | TOTAL | EU28 | 2012 | 76 | INT_ACCESS |
PC_HH | TOTAL | EU28 | 2013 | 79 | INT_ACCESS |
PC_HH | TOTAL | EU28 | 2014 | 81 | INT_ACCESS |
PC_HH | TOTAL | EU28 | 2015 | 83 | INT_ACCESS |
PC_HH | TOTAL | EU28 | 2016 | 85 | INT_ACCESS |
PC_HH | TOTAL | EU28 | 2017 | 87 | INT_ACCESS |
PC_HH | TOTAL | EU28 | 2018 | 89 | INT_ACCESS |
PC_HH | TOTAL | EU28 | 2008 | 48 | H_BROAD |
PC_HH | TOTAL | EU28 | 2009 | 56 | H_BROAD |
PC_HH | TOTAL | EU28 | 2010 | 61 | H_BROAD |
PC_HH | TOTAL | EU28 | 2011 | 67 | H_BROAD |
PC_HH | TOTAL | EU28 | 2012 | 72 | H_BROAD |
PC_HH | TOTAL | EU28 | 2013 | 76 | H_BROAD |
PC_HH | TOTAL | EU28 | 2014 | 78 | H_BROAD |
PC_HH | TOTAL | EU28 | 2015 | 80 | H_BROAD |
PC_HH | TOTAL | EU28 | 2016 | 83 | H_BROAD |
PC_HH | TOTAL | EU28 | 2017 | 85 | H_BROAD |
PC_HH | TOTAL | EU28 | 2018 | 86 | H_BROAD |
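The time column above is converted with as.character() before as.numeric(). A minimal sketch of why, assuming the column arrives as a factor (which the extra as.character() step suggests); the vector time_factor is made up for illustration:

```r
# Years stored as a factor, as they might arrive from the API (assumption).
time_factor <- factor(c("2008", "2013", "2018"))

# Direct conversion returns the internal level codes, not the years.
codes <- as.numeric(time_factor)

# Converting to character first recovers the actual year values.
years <- as.numeric(as.character(time_factor))

print(codes)
print(years)
```

Without the intermediate step, the x axis would run over the level codes rather than over calendar years.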
options(repr.plot.width/height): changes the size of your graph.
ggplot2: the package that facilitates data visualisation.
data: the dataframe/dataset you would like ggplot to visualise.
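aes: the aesthetic mappings, i.e. which columns go on the x and y axes and which column sets the grouping and colour of the lines.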
geom_line(), geom_bar() and more: define what type of graph you would like to build. I am building a line graph, so I am using geom_line().
scale_color_manual: controls the colours of your lines, the labels for your lines, and the name of the legend (which, in my case, is blank).
ggtitle: the name of the graph.
scale_y_continuous/scale_x_continuous: set the limits and the tick breaks of your y and x axes.
theme(text = element_text(size = ...)): changes the size of your font.
ylab/xlab: label the y and x axes.
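To see how the building blocks listed above fit together, here is a minimal sketch that adds them one at a time. The data frame toy and the title and axis texts are made up for illustration; the four values are copied from the 2008 and 2018 rows of the table above.

```r
library(ggplot2)

# Four rows copied from the table above, enough to draw two short lines.
toy <- data.frame(
  time     = c(2008, 2018, 2008, 2018),
  values   = c(60, 89, 48, 86),
  indic_is = c("INT_ACCESS", "INT_ACCESS", "H_BROAD", "H_BROAD")
)

# data + aes: which columns go on which axis, and which column sets the colour.
p <- ggplot(data = toy, aes(x = time, y = values, colour = indic_is))

# geom_line(): draw the mapped data as lines, one per colour group.
p <- p + geom_line()

# ggtitle(), xlab(), ylab(): the text around the plot.
p <- p + ggtitle("Toy example") + xlab("Year") + ylab("% of households")

print(p)
```

Each `+` adds one component to the plot object, so a chart can be built up and checked step by step.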
More information:
options(repr.plot.width=20, repr.plot.height=10)
ggplot(data=dataset_fig1, aes(x=time, y=values, group=factor(indic_is),
color=factor(indic_is))) +
geom_line() +
scale_color_manual(values = c("#F06423", "#276EB4"), labels = c("Internet access",
"Broadband connection"), name = " ") +
ggtitle("Internet access and broadband internet connections of households, EU-28, 2008-2018 (% of all households)") +
scale_y_continuous(limits = c(0, 100), breaks = seq(0, 100, by = 25)) +
scale_x_continuous(limits = c(2008, 2018), breaks = seq(2008, 2018, by = 1)) +
theme(text = element_text(size = 20)) +
ylab(" ") +
xlab(" ")