In this project, worked with data from the CIA World Factbook, a compendium of statistics about all of the countries on Earth. The Factbook contains demographic information for countries like: population - The population as of 2015. population_growth - The annual population growth rate, as a percentage. area - The total land and water area.
%%!
pip install --trusted-host pypi.org ipython-sql
# Instructions
# Copy the text above into a code cell in the Jupyter Notebook on the right.
# Using the menu bar or keyboard shortcut, run the cell.
# You should see the output: 'Connected: None@factbook.db'
%%capture
%load_ext sql
%sql sqlite:///factbook.db
The code above imports the factbook database
# Write a query to return information on the tables in the database.
%%sql
SELECT * FROM sqlite_master WHERE type='table';
* sqlite:///factbook.db Done.
type | name | tbl_name | rootpage | sql |
---|---|---|---|---|
table | sqlite_sequence | sqlite_sequence | 3 | CREATE TABLE sqlite_sequence(name,seq) |
table | facts | facts | 47 | CREATE TABLE "facts" ("id" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, "code" varchar(255) NOT NULL, "name" varchar(255) NOT NULL, "area" integer, "area_land" integer, "area_water" integer, "population" integer, "population_growth" float, "birth_rate" float, "death_rate" float, "migration_rate" float) |
# Write and run another query that returns the first 5 rows of
# the facts table in the database.
%%sql
SELECT *
FROM facts
LIMIT 5
* sqlite:///factbook.db Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate |
---|---|---|---|---|---|---|---|---|---|---|
1 | af | Afghanistan | 652230 | 652230 | 0 | 32564342 | 2.32 | 38.57 | 13.89 | 1.51 |
2 | al | Albania | 28748 | 27398 | 1350 | 3029278 | 0.3 | 12.92 | 6.58 | 3.3 |
3 | ag | Algeria | 2381741 | 2381741 | 0 | 39542166 | 1.84 | 23.67 | 4.31 | 0.92 |
4 | an | Andorra | 468 | 468 | 0 | 85580 | 0.12 | 8.13 | 6.96 | 0.0 |
5 | ao | Angola | 1246700 | 1246700 | 0 | 19625353 | 2.78 | 38.78 | 11.49 | 0.46 |
The query shows the first five rows of the facts table in the factbook database.
Here are the descriptions for some of the columns:
name - The name of the country.
area - The total land and sea area of the country.
population - The country's population.
population_growth- The country's population growth as a percentage.
birth_rate - The country's birth rate, or the number of births a year per 1,000 people.
death_rate - The country's death rate, or the number of death a year per 1,000 people.
area- The country's total area (both land and water).
area_land - The country's land area in square kilometers.
area_water - The country's waterarea in square kilometers.
Let's start by calculating some summary statistics and look for any outlier countries.
# Write a single query that returns the:
# Minimum population
# Maximum population
# Minimum population growth
# Maximum population growth
%%sql
SELECT MIN(population) min_pop, MAX(population) max_pop, MIN(population_growth) min_pop_growth, MAX(population_growth) max_pop_growth
FROM facts
* sqlite:///factbook.db Done.
min_pop | max_pop | min_pop_growth | max_pop_growth |
---|---|---|---|
0 | 7256490011 | 0.0 | 4.02 |
A minimum population of 0 is something that needs further investigation to confirm.Also the maximum poulation for a single country is extremely high.
%%sql
SELECT *
FROM facts
WHERE population == (SELECT MIN(population) FROM facts)
* sqlite:///factbook.db Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate |
---|---|---|---|---|---|---|---|---|---|---|
250 | ay | Antarctica | None | 280000 | None | 0 | None | None | None | None |
From the query above, it was observed that antarctica is the country with 0 population which is reasonable.
%%sql
SELECT *
FROM facts
WHERE population == (SELECT MAX(population) FROM facts)
* sqlite:///factbook.db Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate |
---|---|---|---|---|---|---|---|---|---|---|
261 | xx | World | None | None | None | 7256490011 | 1.08 | 18.6 | 7.8 | None |
The maximum population is for the world.
%%sql
SELECT AVG(population) avg_pop, AVG(area) avg_area
FROM facts
* sqlite:///factbook.db Done.
avg_pop | avg_area |
---|---|
62094928.32231405 | 555093.546184739 |
# Write a query that finds all countries meeting both of the following conditions:
# The population is above average.
# The area is below average.
%%sql
SELECT name Country, population, area
FROM facts
WHERE population > (SELECT AVG(population)
FROM facts) AND
area < (SELECT AVG(area)
FROM facts)
* sqlite:///factbook.db Done.
Country | population | area |
---|---|---|
Bangladesh | 168957745 | 148460 |
Germany | 80854408 | 357022 |
Japan | 126919659 | 377915 |
Philippines | 100998376 | 300000 |
Thailand | 67976405 | 513120 |
United Kingdom | 64088222 | 243610 |
Vietnam | 94348835 | 331210 |
%%sql
SELECT name Country, population / area_land pop_density
FROM facts
ORDER BY pop_density DESC
LIMIT 10
* sqlite:///factbook.db Done.
Country | pop_density |
---|---|
Macau | 21168 |
Monaco | 15267 |
Singapore | 8259 |
Hong Kong | 6655 |
Gaza Strip | 5191 |
Gibraltar | 4876 |
Bahrain | 1771 |
Maldives | 1319 |
Malta | 1310 |
Bermuda | 1299 |
The query above lists the ten most densely populated countries.
That's it for the guided steps. Here are some next steps for you to explore:
Which countries have the highest ratios of water to land? Which countries have more water
than land?
Which countries will add the most people to their population next year?
Which countries have a higher death rate than birth rate?
Looking for feedback on your project? Or maybe you'd like to show it off? Head over to our Community to share your finished Guided Project! You can also view your peers' completed notebooks to see how they approached the project.
%%sql
SELECT name Countries, CAST(area_water as float)/CAST(area_land as float) water_land_ratio
FROM facts
ORDER BY water_land_ratio DESC
LIMIT 10
* sqlite:///factbook.db Done.
Countries | water_land_ratio |
---|---|
British Indian Ocean Territory | 905.6666666666666 |
Virgin Islands | 4.520231213872832 |
Puerto Rico | 0.5547914317925592 |
Bahamas, The | 0.3866133866133866 |
Guinea-Bissau | 0.2846728307254623 |
Malawi | 0.25939625850340137 |
Netherlands | 0.22571032366565366 |
Uganda | 0.22292237442922375 |
Eritrea | 0.16435643564356436 |
Liberia | 0.15623961794019933 |
%%sql
SELECT name Countries, area_water, area_land
FROM facts
WHERE area_water > area_land
* sqlite:///factbook.db Done.
Countries | area_water | area_land |
---|---|---|
British Indian Ocean Territory | 54340 | 60 |
Virgin Islands | 1564 | 346 |
%%sql
SELECT name Country, population_growth
FROM facts
ORDER BY population_growth DESC
LIMIT 15
* sqlite:///factbook.db Done.
Country | population_growth |
---|---|
South Sudan | 4.02 |
Malawi | 3.32 |
Burundi | 3.28 |
Niger | 3.25 |
Uganda | 3.24 |
Qatar | 3.07 |
Burkina Faso | 3.03 |
Mali | 2.98 |
Cook Islands | 2.95 |
Iraq | 2.93 |
Ethiopia | 2.89 |
Zambia | 2.88 |
Western Sahara | 2.82 |
Gaza Strip | 2.81 |
Tanzania | 2.79 |
%%sql
SELECT name Countries, death_rate, birth_rate
FROM facts
WHERE death_rate > birth_rate
* sqlite:///factbook.db Done.
Countries | death_rate | birth_rate |
---|---|---|
Austria | 9.42 | 9.41 |
Belarus | 13.36 | 10.7 |
Bosnia and Herzegovina | 9.75 | 8.87 |
Bulgaria | 14.44 | 8.92 |
Croatia | 12.18 | 9.45 |
Czech Republic | 10.34 | 9.63 |
Estonia | 12.4 | 10.51 |
Germany | 11.42 | 8.47 |
Greece | 11.09 | 8.66 |
Hungary | 12.73 | 9.16 |
Italy | 10.19 | 8.74 |
Japan | 9.51 | 7.93 |
Latvia | 14.31 | 10.0 |
Lithuania | 14.27 | 10.1 |
Moldova | 12.59 | 12.0 |
Monaco | 9.24 | 6.65 |
Poland | 10.19 | 9.74 |
Portugal | 11.02 | 9.27 |
Romania | 11.9 | 9.14 |
Russia | 13.69 | 11.6 |
Serbia | 13.66 | 9.08 |
Slovenia | 11.37 | 8.42 |
Ukraine | 14.46 | 10.72 |
Saint Pierre and Miquelon | 9.72 | 7.42 |