Some of the data provided in this factbook is as follows:
%%capture
%load_ext sql
%sql sqlite:///factbook.db
'Connected: None@factbook.db'
%%sql
SELECT *
FROM sqlite_master
WHERE type='table';
Done.
type | name | tbl_name | rootpage | sql |
---|---|---|---|---|
table | sqlite_sequence | sqlite_sequence | 3 | CREATE TABLE sqlite_sequence(name,seq) |
table | facts | facts | 47 | CREATE TABLE "facts" ("id" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, "code" varchar(255) NOT NULL, "name" varchar(255) NOT NULL, "area" integer, "area_land" integer, "area_water" integer, "population" integer, "population_growth" float, "birth_rate" float, "death_rate" float, "migration_rate" float) |
%%sql
SELECT *
FROM facts
LIMIT 5;
Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate |
---|---|---|---|---|---|---|---|---|---|---|
1 | af | Afghanistan | 652230 | 652230 | 0 | 32564342 | 2.32 | 38.57 | 13.89 | 1.51 |
2 | al | Albania | 28748 | 27398 | 1350 | 3029278 | 0.3 | 12.92 | 6.58 | 3.3 |
3 | ag | Algeria | 2381741 | 2381741 | 0 | 39542166 | 1.84 | 23.67 | 4.31 | 0.92 |
4 | an | Andorra | 468 | 468 | 0 | 85580 | 0.12 | 8.13 | 6.96 | 0.0 |
5 | ao | Angola | 1246700 | 1246700 | 0 | 19625353 | 2.78 | 38.78 | 11.49 | 0.46 |
%%sql
SELECT MIN(population) AS min_pop,
MAX(population) AS max_pop,
MIN(population_growth) AS min_pop_growth,
MAX(population_growth) AS max_pop_growth
FROM facts;
Done.
min_pop | max_pop | min_pop_growth | max_pop_growth |
---|---|---|---|
0 | 7256490011 | 0.0 | 4.02 |
%%sql
SELECT name, population
FROM facts
WHERE population IN (SELECT MIN(population) FROM facts);
Done.
name | population |
---|---|
Antarctica | 0 |
%%sql
SELECT name, population
FROM facts
WHERE population IN (SELECT MAX(population) FROM facts);
Done.
name | population |
---|---|
World | 7256490011 |
We can see that the data includes a row for the whole World called 'World' and that Antarctica has a population of 0 which are both skewing our country population data. We'll rerun the same stats without these rows below.
%%sql
SELECT MIN(population) AS min_pop,
MAX(population) AS max_pop,
MIN(population_growth) AS min_pop_growth,
MAX(population_growth) AS max_pop_growth
FROM facts
WHERE name <> 'World'
AND name <> 'Antarctica';
Done.
min_pop | max_pop | min_pop_growth | max_pop_growth |
---|---|---|---|
48 | 1367485388 | 0.0 | 4.02 |
%%sql
SELECT name, population
FROM facts
WHERE population IN (SELECT MIN(population) FROM facts
WHERE name <> 'World'
AND name <> 'Antarctica');
Done.
name | population |
---|---|
Pitcairn Islands | 48 |
From the above query, we can see that the British territory of Pitcairn Islands has the lowest population of any territory with 48 people, but it is technically not a country.
%%sql
SELECT name, population
FROM facts
WHERE population IN (SELECT MIN(population) FROM facts
WHERE name <> 'World'
AND name <> 'Antarctica'
AND name NOT LIKE '% Islands'
AND name NOT LIKE '% City)');
Done.
name | population |
---|---|
Niue | 1190 |
In order to exclude Island territories and cities, we excluded them in our SQL query filter and returned Niue as the Country with the lowest population: 1190.
%%sql
SELECT ROUND(AVG(population), 2) AS avg_pop,
ROUND(AVG(area), 2) AS avg_area
FROM facts;
Done.
avg_pop | avg_area |
---|---|
62094928.32 | 555093.55 |
%%sql
SELECT *
FROM facts
WHERE population > (SELECT ROUND(AVG(population), 2)
FROM facts
WHERE name <> 'World'
AND name <> 'Antarctica')
AND area < (SELECT ROUND(AVG(area), 2)
FROM facts
WHERE name <> 'World'
AND name <> 'Antarctica')
ORDER BY population DESC;
Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate |
---|---|---|---|---|---|---|---|---|---|---|
14 | bg | Bangladesh | 148460 | 130170 | 18290 | 168957745 | 1.6 | 21.14 | 5.61 | 0.46 |
85 | ja | Japan | 377915 | 364485 | 13430 | 126919659 | 0.16 | 7.93 | 9.51 | 0.0 |
138 | rp | Philippines | 300000 | 298170 | 1830 | 100998376 | 1.61 | 24.27 | 6.11 | 2.09 |
192 | vm | Vietnam | 331210 | 310070 | 21140 | 94348835 | 0.97 | 15.96 | 5.93 | 0.3 |
65 | gm | Germany | 357022 | 348672 | 8350 | 80854408 | 0.17 | 8.47 | 11.42 | 1.24 |
173 | th | Thailand | 513120 | 510890 | 2230 | 67976405 | 0.34 | 11.19 | 7.8 | 0.0 |
185 | uk | United Kingdom | 243610 | 241930 | 1680 | 64088222 | 0.54 | 12.17 | 9.35 | 2.54 |
83 | it | Italy | 301340 | 294140 | 7200 | 61855120 | 0.27 | 8.74 | 10.19 | 4.1 |
91 | ks | Korea, South | 99720 | 96920 | 2800 | 49115196 | 0.14 | 8.19 | 6.75 | 0.0 |
163 | sp | Spain | 505370 | 498980 | 6390 | 48146134 | 0.89 | 9.64 | 9.04 | 8.31 |
139 | pl | Poland | 312685 | 304255 | 8430 | 38562189 | 0.09 | 9.74 | 10.19 | 0.46 |
182 | ug | Uganda | 241038 | 197100 | 43938 | 37101745 | 3.24 | 43.79 | 10.69 | 0.74 |
80 | iz | Iraq | 438317 | 437367 | 950 | 37056169 | 2.93 | 31.45 | 3.77 | 1.62 |
120 | mo | Morocco | 446550 | 446300 | 250 | 33322699 | 1.0 | 18.2 | 4.81 | 3.36 |
Which country has the most people?
%%sql
SELECT *
FROM facts
Where name <> 'World'
AND name <> 'Antarctica'
AND name NOT LIKE '% Islands'
AND name NOT LIKE '% City)'
ORDER BY population DESC
LIMIT 1;
Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate |
---|---|---|---|---|---|---|---|---|---|---|
37 | ch | China | 9596960 | 9326410 | 270550 | 1367485388 | 0.45 | 12.49 | 7.53 | 0.44 |
Which country has the highest growth rate?
%%sql
SELECT *
FROM facts
WHERE name <> 'World'
AND name <> 'Antarctica'
ORDER BY population_growth DESC
LIMIT 1;
Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate |
---|---|---|---|---|---|---|---|---|---|---|
162 | od | South Sudan | 644329 | None | None | 12042910 | 4.02 | 36.91 | 8.18 | 11.47 |
South Sudan has the highest growth rate at 4.02%.
Let's see which country has the lowest growth rate below.
Which country has the lowest growth rate?
%%sql
SELECT *
FROM facts
WHERE name <> 'World'
AND name <> 'Antarctica'
AND name NOT LIKE '% Islands'
AND name NOT LIKE '% City)'
ORDER BY population_growth
LIMIT 1;
Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate |
---|---|---|---|---|---|---|---|---|---|---|
92 | kv | Kosovo | 10887 | 10887 | 0 | 1870981 | None | None | None | None |
From the above query, Kosovo has the lowest growth rate but that seems to be because it is missing data. Let's run this again fitering out countries with 'None' values.
%%sql
SELECT *
FROM facts
WHERE name <> 'World'
AND name <> 'Antarctica'
AND name NOT LIKE '% Islands'
AND name NOT LIKE '% City)'
AND population_growth IS NOT NULL
ORDER BY population_growth
LIMIT 1;
Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate |
---|---|---|---|---|---|---|---|---|---|---|
207 | gl | Greenland | 2166086 | 2166086 | None | 57733 | 0.0 | 14.48 | 8.49 | 5.98 |
Now we can see that Greenland has the lowest population growth which is actually 0.0 and not NULL. After a quick Google search on Greenland, we found that it is a territory of Denmark, so again not technically an autonomous country. Let's exclude it from the query and see if we can find the lowest growth rate for autonomous countries.
%%sql
SELECT *
FROM facts
WHERE name <> 'World'
AND name <> 'Antarctica'
AND name <> 'Greenland'
AND name NOT LIKE '% Islands'
AND name NOT LIKE '% City)'
AND population_growth IS NOT NULL
ORDER BY population_growth
LIMIT 1;
Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate |
---|---|---|---|---|---|---|---|---|---|---|
67 | gr | Greece | 131957 | 130647 | 1310 | 10775643 | 0.01 | 8.66 | 11.09 | 2.32 |
Finally we've found the country with the lowest population growth rate, which is 0.01: Greece.
Which countries have the highest ratios of water to land?
%%sql
SELECT *, ROUND((area_water / area_land)*100, 2) AS water_land_ratio
FROM facts
WHERE name <> 'World'
AND name <> 'Antarctica'
AND name <> 'Greenland'
AND name NOT LIKE '% Islands'
AND name NOT LIKE '% City)'
AND name NOT LIKE '% territory'
AND area_land IS NOT NULL
AND area_water IS NOT NULL
ORDER BY water_land_ratio DESC
LIMIT 20;
Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate | water_land_ratio |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | af | Afghanistan | 652230 | 652230 | 0 | 32564342 | 2.32 | 38.57 | 13.89 | 1.51 | 0.0 |
2 | al | Albania | 28748 | 27398 | 1350 | 3029278 | 0.3 | 12.92 | 6.58 | 3.3 | 0.0 |
3 | ag | Algeria | 2381741 | 2381741 | 0 | 39542166 | 1.84 | 23.67 | 4.31 | 0.92 | 0.0 |
4 | an | Andorra | 468 | 468 | 0 | 85580 | 0.12 | 8.13 | 6.96 | 0.0 | 0.0 |
5 | ao | Angola | 1246700 | 1246700 | 0 | 19625353 | 2.78 | 38.78 | 11.49 | 0.46 | 0.0 |
6 | ac | Antigua and Barbuda | 442 | 442 | 0 | 92436 | 1.24 | 15.85 | 5.69 | 2.21 | 0.0 |
7 | ar | Argentina | 2780400 | 2736690 | 43710 | 43431886 | 0.93 | 16.64 | 7.33 | 0.0 | 0.0 |
8 | am | Armenia | 29743 | 28203 | 1540 | 3056382 | 0.15 | 13.61 | 9.34 | 5.8 | 0.0 |
9 | as | Australia | 7741220 | 7682300 | 58920 | 22751014 | 1.07 | 12.15 | 7.14 | 5.65 | 0.0 |
10 | au | Austria | 83871 | 82445 | 1426 | 8665550 | 0.55 | 9.41 | 9.42 | 5.56 | 0.0 |
11 | aj | Azerbaijan | 86600 | 82629 | 3971 | 9780780 | 0.96 | 16.64 | 7.07 | 0.0 | 0.0 |
12 | bf | Bahamas, The | 13880 | 10010 | 3870 | 324597 | 0.85 | 15.5 | 7.05 | 0.0 | 0.0 |
13 | ba | Bahrain | 760 | 760 | 0 | 1346613 | 2.41 | 13.66 | 2.69 | 13.09 | 0.0 |
14 | bg | Bangladesh | 148460 | 130170 | 18290 | 168957745 | 1.6 | 21.14 | 5.61 | 0.46 | 0.0 |
15 | bb | Barbados | 430 | 430 | 0 | 290604 | 0.31 | 11.87 | 8.44 | 0.3 | 0.0 |
16 | bo | Belarus | 207600 | 202900 | 4700 | 9589689 | 0.2 | 10.7 | 13.36 | 0.7 | 0.0 |
17 | be | Belgium | 30528 | 30278 | 250 | 11323973 | 0.76 | 11.41 | 9.63 | 5.87 | 0.0 |
18 | bh | Belize | 22966 | 22806 | 160 | 347369 | 1.87 | 24.68 | 5.97 | 0.0 | 0.0 |
19 | bn | Benin | 112622 | 110622 | 2000 | 10448647 | 2.78 | 36.02 | 8.21 | 0.0 | 0.0 |
20 | bt | Bhutan | 38394 | 38394 | 0 | 741919 | 1.11 | 17.78 | 6.69 | 0.0 | 0.0 |
Which countries have more water than land?
Looking at the results of our query above, it seems that there is something wrong with the data that is giving nearly every country a 0.0 ratio of water-to-land. We can see here -> https://en.wikipedia.org/wiki/Talk%3AList_of_countries_by_percentage_of_water_area that we're not the only ones to notice this data is incorrect. Due to this, we will leave these findings out of our summary.
Which countries will add the most people to their populations next year?
%%sql
SELECT name, ROUND(birth_rate - death_rate, 2) AS birth_ratio
FROM facts
WHERE name <> 'World'
AND name <> 'Antarctica'
AND name <> 'Greenland'
AND name NOT LIKE '% Islands'
AND name NOT LIKE '% City)'
AND name NOT LIKE '% territory'
ORDER BY birth_ratio DESC
LIMIT 3;
Done.
name | birth_ratio |
---|---|
Malawi | 33.15 |
Uganda | 33.1 |
Niger | 33.03 |
From the above query, we can see that the top 3 countries which will add the most people to their population next year according to birth ratio are:
All of which happen to be countries in the African Continent (where it's very hot climate)
Which countries have a higher death rate than birth rate?
%%sql
SELECT name, ROUND(death_rate - birth_rate, 2) AS death_ratio
FROM facts
WHERE name <> 'World'
AND name <> 'Antarctica'
AND name <> 'Greenland'
AND name NOT LIKE '% Islands'
AND name NOT LIKE '% City)'
AND name NOT LIKE '% territory'
ORDER BY death_ratio DESC
LIMIT 3;
Done.
name | death_ratio |
---|---|
Bulgaria | 5.52 |
Serbia | 4.58 |
Latvia | 4.31 |
In the above query, we can see the top 3 countries which have higher death to birth ratios:
All 3 countries happen to be in Eastern Europe (where it's extremely cold)
Which countries have the highest population/area ratio, and how does it compare to list we found in the previous screen?
%%sql
SELECT name, ROUND(population / area, 2) AS pop_area_ratio
FROM facts
WHERE name <> 'World'
AND name <> 'Antarctica'
AND name <> 'Greenland'
AND name NOT LIKE '% Islands'
AND name NOT LIKE '% City)'
AND name NOT LIKE '% territory'
ORDER BY pop_area_ratio DESC
LIMIT 3;
Done.
name | pop_area_ratio |
---|---|
Macau | 21168.0 |
Monaco | 15267.0 |
Singapore | 8141.0 |
In the above query, we can see that the top 3 countries with the highest population/area ratio are:
Interestingly, these 3 countries are well developed and known to be some of the richest/most expensive countries and are all relatively small.