Our project is to explore & analyze demographic information our data comes from CIA World Factbook,a compendium of statistics about all of the countries on Earth The Factbook contains demographic information like:
%%capture
%load_ext sql
%sql sqlite:///factbook.db
'Connected: None@factbook.db'
%%sql
/* we need to explore sqlite_master database so we
will explore table first */
SELECT *
FROM sqlite_master
WHERE Type='table';
Done.
type | name | tbl_name | rootpage | sql |
---|---|---|---|---|
table | sqlite_sequence | sqlite_sequence | 3 | CREATE TABLE sqlite_sequence(name,seq) |
table | facts | facts | 47 | CREATE TABLE "facts" ("id" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, "code" varchar(255) NOT NULL, "name" varchar(255) NOT NULL, "area" integer, "area_land" integer, "area_water" integer, "population" integer, "population_growth" float, "birth_rate" float, "death_rate" float, "migration_rate" float) |
from code above we find that sqlite_master database contains from 2 table which is :
%%sql
/* let us explore 5 rows of facts table */
SELECT *
FROM facts
LIMIT 5;
Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate |
---|---|---|---|---|---|---|---|---|---|---|
1 | af | Afghanistan | 652230 | 652230 | 0 | 32564342 | 2.32 | 38.57 | 13.89 | 1.51 |
2 | al | Albania | 28748 | 27398 | 1350 | 3029278 | 0.3 | 12.92 | 6.58 | 3.3 |
3 | ag | Algeria | 2381741 | 2381741 | 0 | 39542166 | 1.84 | 23.67 | 4.31 | 0.92 |
4 | an | Andorra | 468 | 468 | 0 | 85580 | 0.12 | 8.13 | 6.96 | 0.0 |
5 | ao | Angola | 1246700 | 1246700 | 0 | 19625353 | 2.78 | 38.78 | 11.49 | 0.46 |
here are the description for some of the columns:
%%sql
/* explore Minimum, Maximum Population & population growth */
SELECT MIN(population) min_population,MAX(population) max_population,
MIN(population_growth) min_population_growth,
MAX(population_growth) max_population_growth
FROM facts;
Done.
min_population | max_population | min_population_growth | max_population_growth |
---|---|---|---|
0 | 7256490011 | 0.0 | 4.02 |
we explore that there are outlier countries
need to explore those countries
%%sql
-- find the countries having minimum population
SELECT *
FROM facts
WHERE population =(SELECT MIN(population) FROM facts);
Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate |
---|---|---|---|---|---|---|---|---|---|---|
250 | ay | Antarctica | None | 280000 | None | 0 | None | None | None | None |
population is 0 for Antarctica as it's not have countries & population on it in range 1,000 to 5,000 depending on season
%%sql
-- find countries which have maximum population
SELECT *
FROM facts
WHERE population= (SELECT MAX(population) FROM facts );
Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate |
---|---|---|---|---|---|---|---|---|---|---|
261 | xx | World | None | None | None | 7256490011 | 1.08 | 18.6 | 7.8 | None |
we discover that the maximum population value was reflect the total population overall the world
%%sql
SELECT MIN(population) min_population,
MAX(population) max_population,
MIN(population_growth) min_population_growth,MAX(population_growth) max_population_growth
FROM facts
WHERE name !='World';
Done.
min_population | max_population | min_population_growth | max_population_growth |
---|---|---|---|
0 | 1367485388 | 0.0 | 4.02 |
Now we find that max_population is 1,367,485,388 which is logic value , let's us discover it's country
%%sql
SELECT *
FROM facts
WHERE population=(
SELECT MAX(population) FROM facts
WHERE name != 'World');
Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate |
---|---|---|---|---|---|---|---|---|---|---|
37 | ch | China | 9596960 | 9326410 | 270550 | 1367485388 | 0.45 | 12.49 | 7.53 | 0.44 |
Now we find China have the maximum population 1,367,485,388 with population growth rate 0.45
%%sql
-- calculate the average value for both population & area
SELECT AVG(population) population_avg, AVG(area) avg_area
FROM facts;
Done.
population_avg | avg_area |
---|---|
62094928.32231405 | 555093.546184739 |
Now let us explore which countries are densely populated by find:
%%sql
SELECT *
FROM facts
WHERE population > (SELECT AVG(population) FROM facts
WHERE name<> 'World')
AND area < (SELECT AVG(area) FROM facts
WHERE name <> 'World')
ORDER BY population DESC;
Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate |
---|---|---|---|---|---|---|---|---|---|---|
14 | bg | Bangladesh | 148460 | 130170 | 18290 | 168957745 | 1.6 | 21.14 | 5.61 | 0.46 |
85 | ja | Japan | 377915 | 364485 | 13430 | 126919659 | 0.16 | 7.93 | 9.51 | 0.0 |
138 | rp | Philippines | 300000 | 298170 | 1830 | 100998376 | 1.61 | 24.27 | 6.11 | 2.09 |
192 | vm | Vietnam | 331210 | 310070 | 21140 | 94348835 | 0.97 | 15.96 | 5.93 | 0.3 |
65 | gm | Germany | 357022 | 348672 | 8350 | 80854408 | 0.17 | 8.47 | 11.42 | 1.24 |
173 | th | Thailand | 513120 | 510890 | 2230 | 67976405 | 0.34 | 11.19 | 7.8 | 0.0 |
185 | uk | United Kingdom | 243610 | 241930 | 1680 | 64088222 | 0.54 | 12.17 | 9.35 | 2.54 |
83 | it | Italy | 301340 | 294140 | 7200 | 61855120 | 0.27 | 8.74 | 10.19 | 4.1 |
91 | ks | Korea, South | 99720 | 96920 | 2800 | 49115196 | 0.14 | 8.19 | 6.75 | 0.0 |
163 | sp | Spain | 505370 | 498980 | 6390 | 48146134 | 0.89 | 9.64 | 9.04 | 8.31 |
139 | pl | Poland | 312685 | 304255 | 8430 | 38562189 | 0.09 | 9.74 | 10.19 | 0.46 |
182 | ug | Uganda | 241038 | 197100 | 43938 | 37101745 | 3.24 | 43.79 | 10.69 | 0.74 |
80 | iz | Iraq | 438317 | 437367 | 950 | 37056169 | 2.93 | 31.45 | 3.77 | 1.62 |
120 | mo | Morocco | 446550 | 446300 | 250 | 33322699 | 1.0 | 18.2 | 4.81 | 3.36 |
%%sql
SELECT COUNT(*) num_densely_populated
FROM facts
WHERE population > (SELECT AVG(population) FROM facts
WHERE name <> 'World')
AND area < (SELECT AVG(area) FROM facts
WHERE name <>'World');
Done.
num_densely_populated |
---|
14 |
From above we find that only 14 countries are densely populated sorted by population descinding Bangladesh to Morocco
%%sql
-- Country has the most people?
SELECT *
FROM facts
WHERE population =(SELECT MAX(population) FROM facts
WHERE name <> 'World');
Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate |
---|---|---|---|---|---|---|---|---|---|---|
37 | ch | China | 9596960 | 9326410 | 270550 | 1367485388 | 0.45 | 12.49 | 7.53 | 0.44 |
Note: we find this result above but let us work as extra question:
%%sql
-- Country has the highest growth rate
SELECT *
FROM facts
WHERE population_growth = (SELECT MAX(population_growth)
FROM facts
WHERE name != 'World' );
Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate |
---|---|---|---|---|---|---|---|---|---|---|
162 | od | South Sudan | 644329 | None | None | 12042910 | 4.02 | 36.91 | 8.18 | 11.47 |
South Sudan has the highest growth rate 4.02 but it's not from the country which are densely populated may be as:
it's population is less than than population avearge might be as migration_ rate is 11.47
it's area is greater than area_avearge
%%sql
SELECT *, CAST(area_water AS FLOAT) / CAST(area AS FLOAT) AS water_ratio
FROM facts
WHERE water_ratio > 0.5 AND
name <> 'World'
ORDER BY water_ratio DESC;
Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate | water_ratio |
---|---|---|---|---|---|---|---|---|---|---|---|
228 | io | British Indian Ocean Territory | 54400 | 60 | 54340 | None | None | None | None | None | 0.9988970588235294 |
247 | vq | Virgin Islands | 1910 | 346 | 1564 | 103574 | 0.59 | 10.31 | 8.54 | 7.67 | 0.818848167539267 |
%%sql
--Which country has maximum birth rate
SELECT *
FROM facts
WHERE birth_rate=(SELECT MAX(birth_rate) FROM facts
WHERE name <> 'World');
Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate |
---|---|---|---|---|---|---|---|---|---|---|
128 | ng | Niger | None | 1266700 | 300 | 18045729 | 3.25 | 45.45 | 12.42 | 0.56 |
we find that Niger has the maximum birth_rate 45.45 it's population 18045729 sharing area 1266700
%%sql
SELECT *
FROM facts
WHERE death_rate > birth_rate AND
name <> 'World'
ORDER BY death_rate - birth_rate DESC;
Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate |
---|---|---|---|---|---|---|---|---|---|---|
26 | bu | Bulgaria | 110879 | 108489 | 2390 | 7186893 | 0.58 | 8.92 | 14.44 | 0.29 |
153 | ri | Serbia | 77474 | 77474 | 0 | 7176794 | 0.46 | 9.08 | 13.66 | 0.0 |
96 | lg | Latvia | 64589 | 62249 | 2340 | 1986705 | 1.06 | 10.0 | 14.31 | 6.26 |
102 | lh | Lithuania | 65300 | 62680 | 2620 | 2884433 | 1.04 | 10.1 | 14.27 | 6.27 |
183 | up | Ukraine | 603550 | 579330 | 24220 | 44429471 | 0.6 | 10.72 | 14.46 | 2.25 |
75 | hu | Hungary | 93028 | 89608 | 3420 | 9897541 | 0.22 | 9.16 | 12.73 | 1.33 |
65 | gm | Germany | 357022 | 348672 | 8350 | 80854408 | 0.17 | 8.47 | 11.42 | 1.24 |
158 | si | Slovenia | 20273 | 20151 | 122 | 1983412 | 0.26 | 8.42 | 11.37 | 0.37 |
142 | ro | Romania | 238391 | 229891 | 8500 | 21666350 | 0.3 | 9.14 | 11.9 | 0.24 |
44 | hr | Croatia | 56594 | 55974 | 620 | 4464844 | 0.13 | 9.45 | 12.18 | 1.39 |
16 | bo | Belarus | 207600 | 202900 | 4700 | 9589689 | 0.2 | 10.7 | 13.36 | 0.7 |
117 | mn | Monaco | 2 | 2 | 0 | 30535 | 0.12 | 6.65 | 9.24 | 3.83 |
67 | gr | Greece | 131957 | 130647 | 1310 | 10775643 | 0.01 | 8.66 | 11.09 | 2.32 |
214 | sb | Saint Pierre and Miquelon | 242 | 242 | 0 | 5657 | 1.08 | 7.42 | 9.72 | 8.49 |
143 | rs | Russia | 17098242 | 16377742 | 720500 | 142423773 | 0.04 | 11.6 | 13.69 | 1.69 |
57 | en | Estonia | 45228 | 42388 | 2840 | 1265420 | 0.55 | 10.51 | 12.4 | 3.6 |
140 | po | Portugal | 92090 | 91470 | 620 | 10825309 | 0.09 | 9.27 | 11.02 | 2.67 |
85 | ja | Japan | 377915 | 364485 | 13430 | 126919659 | 0.16 | 7.93 | 9.51 | 0.0 |
83 | it | Italy | 301340 | 294140 | 7200 | 61855120 | 0.27 | 8.74 | 10.19 | 4.1 |
22 | bk | Bosnia and Herzegovina | 51197 | 51187 | 10 | 3867055 | 0.13 | 8.87 | 9.75 | 0.38 |
47 | ez | Czech Republic | 78867 | 77247 | 1620 | 10644842 | 0.16 | 9.63 | 10.34 | 2.33 |
116 | md | Moldova | 33851 | 32891 | 960 | 3546847 | 1.03 | 12.0 | 12.59 | 9.67 |
139 | pl | Poland | 312685 | 304255 | 8430 | 38562189 | 0.09 | 9.74 | 10.19 | 0.46 |
10 | au | Austria | 83871 | 82445 | 1426 | 8665550 | 0.55 | 9.41 | 9.42 | 5.56 |
table above clear countries which have death_rate > birth_rate sorting by the differance in descending order , Bulgaria have the maximum differance between death & birth rate
%%sql
SELECT * , CAST(population AS FLOAT) / CAST(area AS FLOAT) density_population
FROM facts
WHERE population >(SELECT AVG(population) FROM facts
WHERE name <> 'World')
AND area < (SELECT AVG(area) FROM facts
WHERE name <> 'World')
ORDER BY density_population DESC;
Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate | density_population |
---|---|---|---|---|---|---|---|---|---|---|---|
14 | bg | Bangladesh | 148460 | 130170 | 18290 | 168957745 | 1.6 | 21.14 | 5.61 | 0.46 | 1138.0691432035565 |
91 | ks | Korea, South | 99720 | 96920 | 2800 | 49115196 | 0.14 | 8.19 | 6.75 | 0.0 | 492.53104693140796 |
138 | rp | Philippines | 300000 | 298170 | 1830 | 100998376 | 1.61 | 24.27 | 6.11 | 2.09 | 336.6612533333333 |
85 | ja | Japan | 377915 | 364485 | 13430 | 126919659 | 0.16 | 7.93 | 9.51 | 0.0 | 335.8418136353413 |
192 | vm | Vietnam | 331210 | 310070 | 21140 | 94348835 | 0.97 | 15.96 | 5.93 | 0.3 | 284.8610700160019 |
185 | uk | United Kingdom | 243610 | 241930 | 1680 | 64088222 | 0.54 | 12.17 | 9.35 | 2.54 | 263.0771396904889 |
65 | gm | Germany | 357022 | 348672 | 8350 | 80854408 | 0.17 | 8.47 | 11.42 | 1.24 | 226.4689795026637 |
83 | it | Italy | 301340 | 294140 | 7200 | 61855120 | 0.27 | 8.74 | 10.19 | 4.1 | 205.26687462666754 |
182 | ug | Uganda | 241038 | 197100 | 43938 | 37101745 | 3.24 | 43.79 | 10.69 | 0.74 | 153.92487906471177 |
173 | th | Thailand | 513120 | 510890 | 2230 | 67976405 | 0.34 | 11.19 | 7.8 | 0.0 | 132.47662340193327 |
139 | pl | Poland | 312685 | 304255 | 8430 | 38562189 | 0.09 | 9.74 | 10.19 | 0.46 | 123.3259958104802 |
163 | sp | Spain | 505370 | 498980 | 6390 | 48146134 | 0.89 | 9.64 | 9.04 | 8.31 | 95.26907810119319 |
80 | iz | Iraq | 438317 | 437367 | 950 | 37056169 | 2.93 | 31.45 | 3.77 | 1.62 | 84.5419388250969 |
120 | mo | Morocco | 446550 | 446300 | 250 | 33322699 | 1.0 | 18.2 | 4.81 | 3.36 | 74.6225484268279 |