Welcome to my very first Tidy Tuesday!
I hope you enjoy this journey from a very terrible graph to a beautiful plot. Maybe my notes and workflow will help you learn a thing or two; I know I had to learn a few tricks today!
What I learned today that I didn’t already know:
Manipulating the facet labels: this is one thing I’ve not done before.
I also got more familiar with mutate in the tidyverse.
I also played around with adding text outside of the plot, such as custom labels and notes.
The GitHub README and data can be found here
The Economist Article on “Greying of the Nobel laureates: Over the years, the committee has been bestowing the honour on older and older recipients” was published on Oct 3rd 2016.
My TidyTuesday goal is to recreate The Economist’s graph:
Let’s load the data
library(tidyverse)
library(here)
library(janitor)
nobel_winners <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-05-14/nobel_winners.csv")
nobel_winner_all_pubs <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-05-14/nobel_winner_all_pubs.csv")
Since I want the x axis to be year awarded and the y axis to be the age of the laureate, I need to calculate age.
First stab at a plot
nobel_winners %>%
  # birth_date looks like "YYYY-MM-DD", so the first four characters are the birth year
  mutate(birth_year = as.integer(substring(birth_date, 1, 4))) %>%
  mutate(laureate_age = prize_year - birth_year) %>%
  ggplot() +
  geom_point(aes(x = prize_year, y = laureate_age))
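Before trusting the plot, the age calculation can be checked on a couple of laureates whose ages are known. This is only a minimal sketch on a hand-made table: the `toy` tibble is hypothetical (Borlaug's dates come from the table at the end of this post, Malala Yousafzai's birth date is typed in by hand), and it uses `lubridate::year()` as an alternative to the `substring()` approach above.

```r
library(dplyr)
library(lubridate)

# Hypothetical toy data: two laureates with known birth dates
toy <- tibble(
  full_name  = c("Norman E. Borlaug", "Malala Yousafzai"),
  prize_year = c(1970, 2014),
  birth_date = as.Date(c("1914-03-25", "1997-07-12"))
)

toy_age <- toy %>%
  mutate(
    birth_year   = year(birth_date),        # pull the year out of the Date
    laureate_age = prize_year - birth_year  # year difference only, ignores month and day
  )

toy_age
```

Because only the years are subtracted, the result can be one year higher than the laureate's actual age if the birthday falls after the award date.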

Facet by Category
The facets go in the same order The Economist used, with a smooth line added on top.
library(gdata)  # provides reorder.factor()
target <- c("Medicine", "Physics", "Chemistry", "Economics", "Literature", "Peace")
nobel_winners %>%
  mutate(birth_year = as.integer(substring(birth_date, 1, 4))) %>%
  mutate(laureate_age = prize_year - birth_year) %>%
  # reorder the categories so the facets follow the order in target
  mutate(category = reorder.factor(category, new.order = target)) %>%
  ggplot() +
  geom_point(aes(x = prize_year, y = laureate_age)) +
  facet_grid(. ~ category) +
  geom_smooth(aes(x = prize_year, y = laureate_age), se = FALSE)
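The facet order comes from the factor levels of `category`, so the same reordering should also be possible without gdata. Here is a small sketch using base R's `factor()` with explicit `levels`; the `toy_order` data frame is made up just so the example runs on its own, and this is an alternative to, not a description of, the `reorder.factor()` call above.

```r
library(ggplot2)

target <- c("Medicine", "Physics", "Chemistry", "Economics", "Literature", "Peace")

# Hypothetical toy data: one point per category, listed alphabetically
toy_order <- data.frame(
  category     = sort(target),
  prize_year   = 2000,
  laureate_age = c(60, 65, 70, 55, 75, 50)
)

# Explicit levels fix the order that facet_grid() will use
toy_order$category <- factor(toy_order$category, levels = target)

p_order <- ggplot(toy_order) +
  geom_point(aes(x = prize_year, y = laureate_age)) +
  facet_grid(. ~ category)

levels(toy_order$category)
```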

I’m getting closer!
Now I just need to tidy up the axes
# continues from the mutate() pipeline above
ggplot() +
geom_point(aes(x=prize_year, y=laureate_age))+
facet_grid(.~category)+
geom_smooth(aes(x=prize_year, y=laureate_age),se=FALSE)+
scale_y_continuous(position = "right", limits = c(15,100)) +
theme(panel.grid.major.x = element_blank(), panel.grid.minor = element_blank(), axis.title.y=element_blank())

Adjusting the facet labels
To blend in with the plot, plus changes to the x axis
theme(panel.grid.major.x = element_blank(), panel.grid.minor = element_blank(),
axis.title.y=element_blank(),axis.title.x=element_blank(),
strip.background =element_rect(fill="grey92"),
strip.text.x = element_text(angle = 0, hjust = 0)) +
scale_x_continuous(limits = c(1900,2016), breaks = c(1900, 1950, 2000))

Colors Anyone?
I found this article on The Economist color schemes, but I still had to eyeball it for a couple, using this handy dandy color chart.
geom_point(aes(x=prize_year, y=laureate_age, colour = category), shape = 1)+
geom_smooth(aes(x=prize_year, y=laureate_age, colour = category),se=FALSE, lwd=1.5)+
theme(panel.grid.major.x = element_blank(), panel.grid.minor = element_blank(),
axis.title.y=element_blank(),axis.title.x=element_blank(),
strip.background =element_rect(fill="grey92"),
strip.text.x = element_text(angle = 0, hjust = 0),
legend.position="none",
strip.text = element_text(colour = "grey25")) +
scale_colour_manual(values = c("#014d64","#90353B","#EE6A50","#2D6D66","#EE9A00","#01A2D9"))
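One thing to watch with an unnamed colour vector is that the colours are assigned by factor level order. A hedged sketch of a slightly safer variant: give `scale_colour_manual()` a named vector, so each hex code is matched to its category by name. The pairing below assumes the hex codes above are listed in the same order as `target`; the `toy_colour` data frame is hypothetical.

```r
library(ggplot2)

# Same hex codes as above, named by category (assumed to follow the order of `target`)
category_colours <- c(
  Medicine   = "#014d64",
  Physics    = "#90353B",
  Chemistry  = "#EE6A50",
  Economics  = "#2D6D66",
  Literature = "#EE9A00",
  Peace      = "#01A2D9"
)

# Hypothetical toy data: one point per category
toy_colour <- data.frame(
  category     = names(category_colours),
  prize_year   = 2000,
  laureate_age = 60
)

p_colour <- ggplot(toy_colour) +
  geom_point(aes(x = prize_year, y = laureate_age, colour = category), shape = 1) +
  scale_colour_manual(values = category_colours) +  # matched by name, not by position
  theme(legend.position = "none")
```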

Unfortunately, ggplot does not make it easy to change the facet text colors by category. I read a forum post on this, in case you really want to change the plotting, but I decided to take the hit and not do it.
What one could try is to make the text “invisible” by giving it the same color as the background, and then overlay each category’s text individually in the correct color.
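A rough sketch of that overlay idea, in a slightly different flavour: instead of recolouring the strip text, this version drops the strip entirely and draws each category name inside its own panel with `geom_text()`, coloured by category. Everything here (`toy_strip`, `strip_labels`, the label position in the top-left corner) is an assumption for illustration, not something I ran on the real plot.

```r
library(ggplot2)

target <- c("Medicine", "Physics", "Chemistry", "Economics", "Literature", "Peace")
strip_colours <- setNames(
  c("#014d64", "#90353B", "#EE6A50", "#2D6D66", "#EE9A00", "#01A2D9"),
  target
)

# Hypothetical toy data: one point per category
toy_strip <- data.frame(
  category     = factor(target, levels = target),
  prize_year   = 2000,
  laureate_age = 60
)

# One label per facet, pinned to the top-left corner of its panel
strip_labels <- data.frame(
  category     = factor(target, levels = target),
  prize_year   = -Inf,
  laureate_age = Inf
)

p_strip <- ggplot(toy_strip, aes(x = prize_year, y = laureate_age)) +
  geom_point(aes(colour = category), shape = 1) +
  geom_text(data = strip_labels, aes(label = category, colour = category),
            hjust = 0, vjust = 1, fontface = "bold") +
  facet_grid(. ~ category) +
  scale_colour_manual(values = strip_colours) +
  theme(strip.text = element_blank(),        # hide the default facet labels
        strip.background = element_blank(),
        legend.position = "none")
```

The labels sit inside the plotting area rather than in the strip, so the y limits may need a little headroom to keep them clear of the points.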
Adding additional text
The tricky part….. adding text of the oldest and youngest winners with a little line to the point. hmmmmmm
theme(panel.grid.major.x = element_blank(), panel.grid.minor = element_blank(),
axis.title.y=element_blank(),axis.title.x=element_blank(),
strip.background =element_rect(fill="grey92"),
strip.text.x = element_text(angle = 0, hjust = 0),
legend.position="none",
strip.text = element_text(colour = "grey24", face = 'bold'),
plot.title = element_text(face = 'bold', hjust = 0),
plot.caption = element_text(hjust = 0),
axis.text = element_text(face = 'bold')) +
labs(title = 'Senescience', subtitle ='Age of Nobel laureates, at date of award ', caption ='Source: Nobelprize.org ') +
annotate("text", x= Inf, y= 96, label = "Oldest winner \n Leonid Hurwicz, 90", hjust = 1, size = 2.5, colour = c("grey92","grey92","grey92","#2D6D66","grey92","grey92")) +
annotate("text", x= Inf, y= 25, label = "Youngest winner \n Malala Yousafzai, 17", hjust = 1, size = 2.5, colour = c("grey92","grey92","grey92","grey92","grey92","#01A2D9"))
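Since `annotate()` has no facet variable, its text is not tied to a single category. One alternative that might be simpler is to put the labels in their own small data frame with a `category` column and draw them with `geom_text()`, so each label only shows up in the matching facet. This is a sketch under assumptions: `toy_points` and `winner_labels` are hypothetical names, and the label positions are copied from the `annotate()` calls above.

```r
library(ggplot2)

target <- c("Medicine", "Physics", "Chemistry", "Economics", "Literature", "Peace")

# Hypothetical toy points so the sketch runs on its own
toy_points <- data.frame(
  category     = factor(target, levels = target),
  prize_year   = 2000,
  laureate_age = 60
)

# One row per label; the category column decides which facet shows it
winner_labels <- data.frame(
  category     = factor(c("Economics", "Peace"), levels = target),
  prize_year   = Inf,
  laureate_age = c(96, 25),
  label        = c("Oldest winner\nLeonid Hurwicz, 90",
                   "Youngest winner\nMalala Yousafzai, 17")
)

p_labels <- ggplot(toy_points, aes(x = prize_year, y = laureate_age)) +
  geom_point(shape = 1) +
  geom_text(data = winner_labels, aes(label = label, colour = category),
            hjust = 1, size = 2.5) +
  facet_grid(. ~ category) +
  scale_y_continuous(limits = c(15, 100)) +
  theme(legend.position = "none")
```

With this layout there is no need to hide the text in the other panels by matching the background colour.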

I can’t get all the text outside of the plot, but it’s very close. Also, the legend is a little tricky.
Not bad, not bad
S.MARTINEZ 
THE ECONOMIST

Side note: our own favorite wheat hero, Norman E. Borlaug’s laureate_id is 528.
library(kableExtra)
nobel_winners %>%
  filter(laureate_id == 528) %>%
  kable(caption = "Table 1: Summary of Norman E. Borlaug's Nobel Peace Prize") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = FALSE)
Table 1: Summary of Norman E. Borlaug’s Nobel Peace Prize
prize_year | category | prize | motivation | prize_share | laureate_id | laureate_type | full_name | birth_date | birth_city | birth_country | gender | organization_name | organization_city | organization_country | death_date | death_city | death_country
1970 | Peace | The Nobel Peace Prize 1970 | NA | 1/1 | 528 | Individual | Norman E. Borlaug | 1914-03-25 | Cresco, IA | United States of America | Male | NA | NA | NA | 2009-09-12 | Dallas, TX | United States of America
