Welcome to my very first Tidy Tuesday!
I hope you enjoy this journey from a very terrible graph to a beautiful plot. Maybe my notes and workflow help you learn a thing or two, I know I had to learn a few tricks today!
What I learned today that I didnt already know:
Manipulating the facet labels: this is one thing I’ve not done before.
I also got more familiar with mutate
in the tidyverse.
And playing around with adding text outside of the plot, such as customize labels and notes.
The github README and data can be found here
The Economist Article on “Greying of the Nobel laureates: Over the years, the committee has been bestowing the honour on older and older recipients” was published on Oct 3rd 2016.
My TidyTuesday goal is to recreate The Economists graph:
Lets load the data
library(tidyverse)
library(here)
library(janitor)
nobel_winners <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-05-14/nobel_winners.csv")
nobel_winner_all_pubs <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-05-14/nobel_winner_all_pubs.csv")
Since I want the x axis to be year awarded and the y axis to be the age of the laureate, I need to calculate age.
First stab at a plot
nobel_winners %>%
mutate(birth_year = substring(nobel_winners$birth_date, 1, 4), birth_year = as.integer(birth_year)) %>%
mutate(laureate_age = prize_year - birth_year) %>%
ggplot() +
geom_point(aes(x=prize_year, y=laureate_age))
Facet by Category
In the correct order that The Economist listed. And also add the smooth line.
require(gdata)
target <- c("Medicine", "Physics", "Chemistry","Economics","Literature","Peace") #Order defined
nobel_winners %>%
mutate(birth_year = substring(nobel_winners$birth_date, 1, 4), birth_year = as.integer(birth_year)) %>%
mutate(laureate_age = prize_year - birth_year) %>%
mutate(category = reorder.factor(category, new.order=target)) %>% #reorder by defined order
ggplot() +
geom_point(aes(x=prize_year, y=laureate_age))+
facet_grid(.~category)+
geom_smooth(aes(x=prize_year, y=laureate_age),se=FALSE)
Im getting closer!
Now I just need to tidy up the axis
ggplot() +
geom_point(aes(x=prize_year, y=laureate_age))+
facet_grid(.~category)+
geom_smooth(aes(x=prize_year, y=laureate_age),se=FALSE)+
scale_y_continuous(position = "right", limits = c(15,100)) + #update y axis
theme(panel.grid.major.x = element_blank(), panel.grid.minor = element_blank(), axis.title.y=element_blank()) #edit grid background and rm y axis label
Adjusting the facet labels
To blend in with plot + changes to the x axis
theme(panel.grid.major.x = element_blank(), panel.grid.minor = element_blank(),
axis.title.y=element_blank(),axis.title.x=element_blank(),
strip.background =element_rect(fill="grey92"), #change facet label background color
strip.text.x = element_text(angle = 0, hjust = 0)) + #move labelled text to the left
scale_x_continuous(limits = c(1900,2016), breaks = c(1900, 1950, 2000)) #adjust x axis increments
Colors Anyone?
I found this article on The Economist color schemes, but I still had to eyeball it for a couple, using this handy dandy color chart.
geom_point(aes(x=prize_year, y=laureate_age, colour = category), shape = 1)+ #add color and hollow point
geom_smooth(aes(x=prize_year, y=laureate_age, colour = category),se=FALSE, lwd=1.5)+ #add color and thicken line
theme(panel.grid.major.x = element_blank(), panel.grid.minor = element_blank(),
axis.title.y=element_blank(),axis.title.x=element_blank(),
strip.background =element_rect(fill="grey92"),
strip.text.x = element_text(angle = 0, hjust = 0),
legend.position="none") , #remove legend
strip.text = element_text(colour = "grey25")) + # change facet text color
scale_colour_manual(values = c("#014d64","#90353B","#EE6A50","#2D6D66","#EE9A00","#01A2D9")) #manual color
Unfortanetly, ggplot does not easily change the facet text colors by category. I read a forum post on this if you want to really change the plotting, however I decided to take the hit and not do it.
What one could try to do is make the text “invisible” by making it the same color as the background, and they overlay individually the category text with the correct color.
Adding additional text
The tricky part….. adding text of the oldest and youngest winners with a little line to the point. hmmmmmm
theme(panel.grid.major.x = element_blank(), panel.grid.minor = element_blank(),
axis.title.y=element_blank(),axis.title.x=element_blank(),
strip.background =element_rect(fill="grey92"),
strip.text.x = element_text(angle = 0, hjust = 0),
legend.position="none",
strip.text = element_text(colour = "grey24", face = 'bold'), #bold facet labels
plot.title = element_text(face = 'bold', hjust = 0), #change title to left align
plot.caption = element_text(hjust = 0), #change caption to left align
axis.text = element_text(face = 'bold')) +
labs(title = 'Senescience', subtitle ='Age of Nobel laureates, at date of award ', caption ='Source: Nobelprize.org ') + #add labels
annotate("text", x= Inf, y= 96, label = "Oldest winner \n Leonid Hurwicz, 90", hjust = 1, size = 2.5, colour = c("grey92","grey92","grey92","#2D6D66","grey92","grey92")) + #add Oldest
annotate("text", x= Inf, y= 25, label = "Youngest winner \n Malala Yousafzai, 17", hjust = 1, size = 2.5, colour = c("grey92","grey92","grey92","grey92","grey92","#01A2D9")) #add youngest
I cant get all the text outside of the plot, buts its very close. Also the legend is a little tricky
Not bad, not bad
S.MARTINEZ
THE ECONOMIST
Side note: our own favorite wheat hero, Norman E. Borlaug’s laureate_id
is 528
.
library(kableExtra)
nobel_winners %>%
filter(laureate_id == '528') %>%
kable(caption = "Table 1: Summary Norman E. Borlougs Nobel Peace Prize") %>%
kable_styling(bootstrap_options = c("striped", "hover"), full_width = F)
Table 1: Summary Norman E. Borlougs Nobel Peace Prize
prize_year
|
category
|
prize
|
motivation
|
prize_share
|
laureate_id
|
laureate_type
|
full_name
|
birth_date
|
birth_city
|
birth_country
|
gender
|
organization_name
|
organization_city
|
organization_country
|
death_date
|
death_city
|
death_country
|
1970
|
Peace
|
The Nobel Peace Prize 1970
|
NA
|
1/1
|
528
|
Individual
|
Norman E. Borlaug
|
1914-03-25
|
Cresco, IA
|
United States of America
|
Male
|
NA
|
NA
|
NA
|
2009-09-12
|
Dallas, TX
|
United States of America
|
---
title: "TidyTuesdayWk20"
output:
  html_document:
    code_download: yes
    toc: true
    toc_depth: 5
    toc_float: true
    theme: flatly
  pdf_document: default
date: "`r format(Sys.time(), '%Y.%m.%d')`"
author: S.A. Martinez

---

**Welcome to my very first Tidy Tuesday!**   

I hope you enjoy this journey from a very terrible graph to a beautiful plot. Maybe my notes and workflow help you learn a thing or two, I know I had to learn a few tricks today!

**What I learned today that I didnt already know:**  

> Manipulating the facet labels: this is one thing I've not done before.  
> I also got more familiar with `mutate` in the tidyverse.  
> And playing around with adding text outside of the plot, such as customize labels and notes.  

-----

The github [README](https://github.com/rfordatascience/tidytuesday/blob/master/data/2019/2019-05-14/readme.md) and data can be found [here](https://github.com/rfordatascience/tidytuesday/tree/master/data/2019/2019-05-14)  

The [Economist Article](https://www.economist.com/graphic-detail/2016/10/03/greying-of-the-nobel-laureates) on "Greying of the Nobel laureates: Over the years, the committee has been bestowing the honour on older and older recipients" was published on Oct 3rd 2016.  

**My TidyTuesday goal is to recreate The Economists graph:**   

![](TidyTuesdayWk20_EconomistPlot.png)

-------

### Lets load the data  
```{r  message=FALSE, warning=FALSE }
library(tidyverse)
library(here)
library(janitor)

nobel_winners <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-05-14/nobel_winners.csv")
nobel_winner_all_pubs <- readr::read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-05-14/nobel_winner_all_pubs.csv")
```

Since I want the x axis to be year awarded and the y axis to be the age of the laureate, I need to calculate age.

### First stab at a plot 
```{r warning=FALSE}
nobel_winners %>% 
  mutate(birth_year = substring(nobel_winners$birth_date, 1, 4), birth_year = as.integer(birth_year)) %>%
  mutate(laureate_age = prize_year - birth_year) %>%
  ggplot() +
  geom_point(aes(x=prize_year, y=laureate_age))

```

### Facet by `Category` 
In the correct order that The Economist listed. And also add the smooth line.  

```{r  eval=FALSE}
require(gdata)
target <- c("Medicine", "Physics", "Chemistry","Economics","Literature","Peace") #Order defined

nobel_winners %>% 
  mutate(birth_year = substring(nobel_winners$birth_date, 1, 4), birth_year = as.integer(birth_year)) %>%
  mutate(laureate_age = prize_year - birth_year) %>%
  mutate(category = reorder.factor(category, new.order=target)) %>%  #reorder by defined order
  ggplot() +
    geom_point(aes(x=prize_year, y=laureate_age))+
    facet_grid(.~category)+
    geom_smooth(aes(x=prize_year, y=laureate_age),se=FALSE)
```

```{r  echo=FALSE, warning=FALSE, message=FALSE}
require(gdata)
target <- c("Medicine", "Physics", "Chemistry","Economics","Literature","Peace") #Order defined

nobel_winners %>% 
  mutate(birth_year = substring(nobel_winners$birth_date, 1, 4), birth_year = as.integer(birth_year)) %>%
  mutate(laureate_age = prize_year - birth_year) %>%
  mutate(category = reorder.factor(category, new.order=target)) %>%  #reorder by defined order
  ggplot() +
    geom_point(aes(x=prize_year, y=laureate_age))+
    facet_grid(.~category)+
    geom_smooth(aes(x=prize_year, y=laureate_age),se=FALSE)
```

### Im getting closer!   

Now I just need to tidy up the axis
```{r eval = FALSE}
  ggplot() +
    geom_point(aes(x=prize_year, y=laureate_age))+
    facet_grid(.~category)+
    geom_smooth(aes(x=prize_year, y=laureate_age),se=FALSE)+
    scale_y_continuous(position = "right", limits = c(15,100)) +   #update y axis 
    theme(panel.grid.major.x = element_blank(), panel.grid.minor = element_blank(), axis.title.y=element_blank()) #edit grid background and rm y axis label
```

```{r echo = FALSE, warning=FALSE, message=FALSE}
require(gdata)
target <- c("Medicine", "Physics", "Chemistry","Economics","Literature","Peace") 

nobel_winners %>% 
  mutate(birth_year = substring(nobel_winners$birth_date, 1, 4), birth_year = as.integer(birth_year)) %>%
  mutate(laureate_age = prize_year - birth_year) %>%
  mutate(category = reorder.factor(category, new.order=target)) %>%  
  ggplot() +
    geom_point(aes(x=prize_year, y=laureate_age))+
    facet_grid(.~category)+
    geom_smooth(aes(x=prize_year, y=laureate_age),se=FALSE)+
    scale_y_continuous(position = "right", limits = c(15,100)) +   #update y axis 
    theme(panel.grid.major.x = element_blank(), panel.grid.minor = element_blank(), axis.title.y=element_blank()) #edit grid background and rm y axis label
```


### Adjusting the facet labels
To blend in with plot + changes to the x axis 
```{r eval = FALSE}
    theme(panel.grid.major.x = element_blank(), panel.grid.minor = element_blank(),
          axis.title.y=element_blank(),axis.title.x=element_blank(),
          strip.background =element_rect(fill="grey92"), #change facet label background color 
          strip.text.x = element_text(angle = 0, hjust = 0)) + #move labelled text to the left
    scale_x_continuous(limits = c(1900,2016), breaks = c(1900, 1950, 2000)) #adjust x axis increments 
```

```{r echo = FALSE, warning=FALSE, message=FALSE}
require(gdata)
target <- c("Medicine", "Physics", "Chemistry","Economics","Literature","Peace") 

nobel_winners %>% 
  mutate(birth_year = substring(nobel_winners$birth_date, 1, 4), birth_year = as.integer(birth_year)) %>%
  mutate(laureate_age = prize_year - birth_year) %>%
  mutate(category = reorder.factor(category, new.order=target)) %>%  
  ggplot() +
    geom_point(aes(x=prize_year, y=laureate_age))+
    facet_grid(.~category)+
    geom_smooth(aes(x=prize_year, y=laureate_age),se=FALSE)+
    scale_y_continuous(position = "right", limits = c(15,100)) +  
    theme(panel.grid.major.x = element_blank(), panel.grid.minor = element_blank(),
          axis.title.y=element_blank(),axis.title.x=element_blank(),
          strip.background =element_rect(fill="grey92"), #change facet label background color 
          strip.text.x = element_text(angle = 0, hjust = 0)) + #move labelled text to the left
    scale_x_continuous(limits = c(1900,2016), breaks = c(1900, 1950, 2000)) #adjust x axis increments 
```

-----

### Colors Anyone?     
I found this [article](https://yutannihilation.github.io/allYourFigureAreBelongToUs/ggthemes/economist_pal/) on The Economist color schemes, but I still had to eyeball it for a couple, using this handy dandy [color chart](http://research.stowers.org/mcm/efg/R/Color/Chart/ColorChart.pdf).  

```{r eval= FALSE, warning=FALSE, message=FALSE}
    geom_point(aes(x=prize_year, y=laureate_age, colour = category), shape = 1)+  #add color and hollow point 
    geom_smooth(aes(x=prize_year, y=laureate_age, colour = category),se=FALSE, lwd=1.5)+ #add color and thicken line
    theme(panel.grid.major.x = element_blank(), panel.grid.minor = element_blank(),
          axis.title.y=element_blank(),axis.title.x=element_blank(),
          strip.background =element_rect(fill="grey92"), 
          strip.text.x = element_text(angle = 0, hjust = 0),
          legend.position="none") , #remove legend
          strip.text = element_text(colour = "grey25")) + # change facet text color
    scale_colour_manual(values = c("#014d64","#90353B","#EE6A50","#2D6D66","#EE9A00","#01A2D9")) #manual color
```

```{r echo  = FALSE, warning=FALSE, message=FALSE}
require(gdata)
target <- c("Medicine", "Physics", "Chemistry","Economics","Literature","Peace") 

library(extrafont)
nobel_winners %>% 
  mutate(birth_year = substring(nobel_winners$birth_date, 1, 4), birth_year = as.integer(birth_year)) %>%
  mutate(laureate_age = prize_year - birth_year) %>%
  mutate(category = reorder.factor(category, new.order=target)) %>%  
  ggplot() +
    geom_point(aes(x=prize_year, y=laureate_age, colour = category), shape = 1)+  #add color and hollow point 
    facet_grid(.~category)+
    geom_smooth(aes(x=prize_year, y=laureate_age, colour = category),se=FALSE, lwd=1.5)+ #add color and thicken line
    scale_y_continuous(position = "right", limits = c(15,100)) +  
    theme(panel.grid.major.x = element_blank(), panel.grid.minor = element_blank(),
          axis.title.y=element_blank(),axis.title.x=element_blank(),
          strip.background =element_rect(fill="grey92"), 
          strip.text.x = element_text(angle = 0, hjust = 0),
          legend.position="none", #remove legend
          strip.text = element_text(colour = "grey25")) + # change facet text color
    scale_x_continuous(limits = c(1900,2016), breaks = c(1900, 1950, 2000))+ 
    scale_colour_manual(values = c("#014d64","#90353B","#EE6A50","#2D6D66","#EE9A00","#01A2D9")) 
```

Unfortanetly, ggplot does not easily change the facet text colors by category. I read a [forum post](https://github.com/tidyverse/ggplot2/issues/2096) on this if you want to really change the plotting, however I decided to take the hit and not do it.  

What one could try to do is make the text "invisible" by making it the same color as the background, and they overlay individually the category text with the correct color.  

### Adding additional text  
The tricky part..... adding text of the oldest and youngest winners with a little line to the point. hmmmmmm

```{r eval= FALSE, warning=FALSE, message=FALSE}
    theme(panel.grid.major.x = element_blank(), panel.grid.minor = element_blank(),
          axis.title.y=element_blank(),axis.title.x=element_blank(),
          strip.background =element_rect(fill="grey92"), 
          strip.text.x = element_text(angle = 0, hjust = 0),
          legend.position="none", 
          strip.text = element_text(colour = "grey24", face = 'bold'), #bold facet labels
          plot.title = element_text(face = 'bold', hjust = 0),  #change title to left align
          plot.caption = element_text(hjust = 0), #change caption to left align
          axis.text =  element_text(face = 'bold')) +
    labs(title = 'Senescience',  subtitle ='Age of Nobel laureates, at date of award ', caption ='Source: Nobelprize.org ') + #add labels
    annotate("text", x= Inf, y= 96, label = "Oldest winner \n Leonid Hurwicz, 90", hjust = 1, size = 2.5, colour = c("grey92","grey92","grey92","#2D6D66","grey92","grey92")) + #add Oldest
    annotate("text", x= Inf, y= 25, label = "Youngest winner \n Malala Yousafzai, 17", hjust = 1, size = 2.5, colour = c("grey92","grey92","grey92","grey92","grey92","#01A2D9")) #add youngest
  
```

```{r echo  = FALSE, warning=FALSE, message=FALSE}
require(gdata)
target <- c("Medicine", "Physics", "Chemistry","Economics","Literature","Peace") 

library(extrafont)

nobel_winners %>% 
  mutate(birth_year = substring(nobel_winners$birth_date, 1, 4), birth_year = as.integer(birth_year)) %>%
  mutate(laureate_age = prize_year - birth_year) %>%
  mutate(category = reorder.factor(category, new.order=target)) %>%  
  ggplot() +
    geom_point(aes(x=prize_year, y=laureate_age, colour = category), shape = 1)+  
    facet_grid(.~category)+
    geom_smooth(aes(x=prize_year, y=laureate_age, colour = category),se=FALSE, lwd=1.5)+ 
    scale_y_continuous(position = "right", limits = c(15,100)) +  
    theme(panel.grid.major.x = element_blank(), panel.grid.minor = element_blank(),
          axis.title.y=element_blank(),axis.title.x=element_blank(),
          strip.background =element_rect(fill="grey92"), 
          strip.text.x = element_text(angle = 0, hjust = 0),
          legend.position="none", 
          strip.text = element_text(colour = "grey24", face = 'bold'), #bold facet labels
          plot.title = element_text(face = 'bold', hjust = 0),  #change title to left align
          plot.caption = element_text(hjust = 0), #change caption to left align
          axis.text =  element_text(face = 'bold')) + 
    scale_x_continuous(limits = c(1900,2016), breaks = c(1900, 1950, 2000))+ 
    scale_colour_manual(values = c("#014d64","#90353B","#EE6A50","#2D6D66","#EE9A00","#01A2D9")) +
  labs(title = 'Senescience',  subtitle ='Age of Nobel laureates, at date of award ', caption ='Source: Nobelprize.org ') + #add labels
  annotate("text", x= Inf, y= 96, label = "Oldest winner \n Leonid Hurwicz, 90", hjust = 1, size = 2.5, colour = c("grey92","grey92","grey92","#2D6D66","grey92","grey92")) + #add Oldest
  annotate("text", x= Inf, y= 25, label = "Youngest winner \n Malala Yousafzai, 17", hjust = 1, size = 2.5, colour = c("grey92","grey92","grey92","grey92","grey92","#01A2D9"))+ #add youngest
  ggsave("TidyTuesdayWk20_Plot.png")
   
```


I cant get all the text outside of the plot, buts its very close. Also the legend is a little tricky  

-----

### Not bad, not bad   

**S.MARTINEZ**
![](TidyTuesdayWk20_Plot.png) 

**THE ECONOMIST**  
![](TidyTuesdayWk20_EconomistPlot.png) 


-------

Side note: our own favorite wheat hero, Norman E. Borlaug's `laureate_id` is `528`.  

```{r  message=FALSE, warning=FALSE }
library(kableExtra)
nobel_winners %>% 
  filter(laureate_id == '528') %>%
  kable(caption = "Table 1: Summary Norman E. Borlougs Nobel Peace Prize") %>%
  kable_styling(bootstrap_options = c("striped", "hover"), full_width = F) 
```