from google.colab import drive
drive.mount('/content/drive')
Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
##%%capture
%load_ext sql
%sql sqlite:///factbook.db
The sql extension is already loaded. To reload it, use: %reload_ext sql
'Connected: @factbook.db'
%sql SELECT * FROM sqlite_master WHERE type='table';
* sqlite:///factbook.db Done.
type | name | tbl_name | rootpage | sql |
---|---|---|---|---|
table | sqlite_sequence | sqlite_sequence | 3 | CREATE TABLE sqlite_sequence(name,seq) |
table | facts | facts | 47 | CREATE TABLE "facts" ("id" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, "code" varchar(255) NOT NULL, "name" varchar(255) NOT NULL, "area" integer, "area_land" integer, "area_water" integer, "population" integer, "population_growth" float, "birth_rate" float, "death_rate" float, "migration_rate" float) |
%%sql
SELECT * FROM sqlite_master WHERE type='table';
* sqlite:///factbook.db Done.
type | name | tbl_name | rootpage | sql |
---|---|---|---|---|
table | sqlite_sequence | sqlite_sequence | 3 | CREATE TABLE sqlite_sequence(name,seq) |
table | facts | facts | 47 | CREATE TABLE "facts" ("id" INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL, "code" varchar(255) NOT NULL, "name" varchar(255) NOT NULL, "area" integer, "area_land" integer, "area_water" integer, "population" integer, "population_growth" float, "birth_rate" float, "death_rate" float, "migration_rate" float) |
%%sql
SELECT * FROM facts LIMIT 5
* sqlite:///factbook.db Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate |
---|---|---|---|---|---|---|---|---|---|---|
1 | af | Afghanistan | 652230 | 652230 | 0 | 32564342 | 2.32 | 38.57 | 13.89 | 1.51 |
2 | al | Albania | 28748 | 27398 | 1350 | 3029278 | 0.3 | 12.92 | 6.58 | 3.3 |
3 | ag | Algeria | 2381741 | 2381741 | 0 | 39542166 | 1.84 | 23.67 | 4.31 | 0.92 |
4 | an | Andorra | 468 | 468 | 0 | 85580 | 0.12 | 8.13 | 6.96 | 0.0 |
5 | ao | Angola | 1246700 | 1246700 | 0 | 19625353 | 2.78 | 38.78 | 11.49 | 0.46 |
Here are the descriptions for some of the columns:
%%sql
SELECT MIN(population) min_pop, MAX(population) max_pop, MIN(population) population_growth, MAX(population_growth) max_pop_growth
FROM facts
* sqlite:///factbook.db Done.
min_pop | max_pop | population_growth | max_pop_growth |
---|---|---|---|
0 | 7256490011 | 0 | 4.02 |
Seems like there are somew outliers with 0 and 7256490011 population, let's zoom in
%%sql
SELECT *
FROM facts
WHERE population = 0
* sqlite:///factbook.db Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate |
---|---|---|---|---|---|---|---|---|---|---|
250 | ay | Antarctica | None | 280000 | None | 0 | None | None | None | None |
%%sql
SELECT *
FROM facts
WHERE population = 7256490011
* sqlite:///factbook.db Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate |
---|---|---|---|---|---|---|---|---|---|---|
261 | xx | World | None | None | None | 7256490011 | 1.08 | 18.6 | 7.8 | None |
It seems like the table contains a row for the whole world, which explains the population of over 7.2 billion. It also seems like the table contains a row for Antarctica, which explains the population of 0. This seems to match the CIA Factbook page for Antarctica
%%sql
SELECT AVG(population), AVG(area)
FROM facts
* sqlite:///factbook.db Done.
AVG(population) | AVG(area) |
---|---|
62094928.32231405 | 555093.546184739 |
Let's find densed populated countries (population more than average and area less then average):
%%sql
SELECT *
FROM facts
WHERE population > (SELECT AVG(population) FROM facts)
AND area < (SELECT AVG(area) FROM facts)
* sqlite:///factbook.db Done.
id | code | name | area | area_land | area_water | population | population_growth | birth_rate | death_rate | migration_rate |
---|---|---|---|---|---|---|---|---|---|---|
14 | bg | Bangladesh | 148460 | 130170 | 18290 | 168957745 | 1.6 | 21.14 | 5.61 | 0.46 |
65 | gm | Germany | 357022 | 348672 | 8350 | 80854408 | 0.17 | 8.47 | 11.42 | 1.24 |
85 | ja | Japan | 377915 | 364485 | 13430 | 126919659 | 0.16 | 7.93 | 9.51 | 0.0 |
138 | rp | Philippines | 300000 | 298170 | 1830 | 100998376 | 1.61 | 24.27 | 6.11 | 2.09 |
173 | th | Thailand | 513120 | 510890 | 2230 | 67976405 | 0.34 | 11.19 | 7.8 | 0.0 |
185 | uk | United Kingdom | 243610 | 241930 | 1680 | 64088222 | 0.54 | 12.17 | 9.35 | 2.54 |
192 | vm | Vietnam | 331210 | 310070 | 21140 | 94348835 | 0.97 | 15.96 | 5.93 | 0.3 |
Bangladesh, Germany, Japan, Philippines, Thailand, UK and Vietnam are densed populated
%%sql
SELECT CAST(area_water AS FLOAT) / area_land AS water_land_ratio, name as country
FROM facts
WHERE area_water !=0
ORDER BY water_land_ratio DESC
LIMIT 10
* sqlite:///factbook.db Done.
water_land_ratio | country |
---|---|
905.6666666666666 | British Indian Ocean Territory |
4.520231213872832 | Virgin Islands |
0.5547914317925592 | Puerto Rico |
0.3866133866133866 | Bahamas, The |
0.2846728307254623 | Guinea-Bissau |
0.25939625850340137 | Malawi |
0.22571032366565366 | Netherlands |
0.22292237442922375 | Uganda |
0.16435643564356436 | Eritrea |
0.15623961794019933 | Liberia |
British Indian Ocean Territory has the highes water to land ratio
%%sql
SELECT area_water, area_land, name as country_name
FROM facts
WHERE area_water > area_land
* sqlite:///factbook.db Done.
area_water | area_land | country_name |
---|---|---|
54340 | 60 | British Indian Ocean Territory |
1564 | 346 | Virgin Islands |
British Indian Ocean Territory and Virgin Islands have more water than land
%%sql
SELECT population_growth, name as country_name
FROM facts
ORDER BY population_growth DESC
LIMIT 10
* sqlite:///factbook.db Done.
population_growth | country_name |
---|---|
4.02 | South Sudan |
3.32 | Malawi |
3.28 | Burundi |
3.25 | Niger |
3.24 | Uganda |
3.07 | Qatar |
3.03 | Burkina Faso |
2.98 | Mali |
2.95 | Cook Islands |
2.93 | Iraq |
%%sql
SELECT death_rate, birth_rate, population_growth, ROUND(death_rate - birth_rate, 2) AS rate_diff, name as country_name
FROM facts
WHERE death_rate > birth_rate
ORDER BY rate_diff DESC
* sqlite:///factbook.db Done.
death_rate | birth_rate | population_growth | rate_diff | country_name |
---|---|---|---|---|
14.44 | 8.92 | 0.58 | 5.52 | Bulgaria |
13.66 | 9.08 | 0.46 | 4.58 | Serbia |
14.31 | 10.0 | 1.06 | 4.31 | Latvia |
14.27 | 10.1 | 1.04 | 4.17 | Lithuania |
14.46 | 10.72 | 0.6 | 3.74 | Ukraine |
12.73 | 9.16 | 0.22 | 3.57 | Hungary |
11.42 | 8.47 | 0.17 | 2.95 | Germany |
11.37 | 8.42 | 0.26 | 2.95 | Slovenia |
11.9 | 9.14 | 0.3 | 2.76 | Romania |
12.18 | 9.45 | 0.13 | 2.73 | Croatia |
13.36 | 10.7 | 0.2 | 2.66 | Belarus |
9.24 | 6.65 | 0.12 | 2.59 | Monaco |
11.09 | 8.66 | 0.01 | 2.43 | Greece |
9.72 | 7.42 | 1.08 | 2.3 | Saint Pierre and Miquelon |
13.69 | 11.6 | 0.04 | 2.09 | Russia |
12.4 | 10.51 | 0.55 | 1.89 | Estonia |
11.02 | 9.27 | 0.09 | 1.75 | Portugal |
9.51 | 7.93 | 0.16 | 1.58 | Japan |
10.19 | 8.74 | 0.27 | 1.45 | Italy |
9.75 | 8.87 | 0.13 | 0.88 | Bosnia and Herzegovina |
10.34 | 9.63 | 0.16 | 0.71 | Czech Republic |
12.59 | 12.0 | 1.03 | 0.59 | Moldova |
10.19 | 9.74 | 0.09 | 0.45 | Poland |
9.42 | 9.41 | 0.55 | 0.01 | Austria |
Bulgaria has the highest death rate and birth rate difference