As data analysts for a company that builds Android and iOS mobile apps, we aim to find mobile app profiles that are potentially profitable for both App Store and Google Play markets. This project enables our team of developers to make data-driven decisions with respect to the kind of apps they build.
As our company only builds apps that are free to download and install, our main source of revenue is in-app advertisements. Since advertisers are heavily influenced by the number of users of any given app, the focus of our project is to analyse the data to identify the kinds of apps that are likely to attract more users.
As of September 2018, the number of iOS apps available on the App Store and the number of Android apps on Google Play were approximately 2 million and 2.1 million respectively.
Prior to committing a significant amount of time and money to collecting data for over four million apps, we would try to:
For our purpose, the following sets of data are deemed suitable:
A data set containing data on approximately ten thousand Android apps from Google Play, and that can be downloaded directly from this link.
A data set containing data on approximately seven thousand iOS apps from the App Store, also downloadable directly from this other link.
We shall begin by opening the two data sets, before moving on to explore the data.
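from csv import reader

def open_dataset(file_name):
    # Read the whole CSV file into a list of rows (each row is a list of strings).
    # The encoding is set explicitly because some app names are not ASCII.
    with open(file_name, encoding='utf8') as opened_file:
        read_file = reader(opened_file)
        data = list(read_file)
    return data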
The open_dataset() function is the first of many that we'll be writing while exploring the data sets. These functions let us repeat the same steps on both data sets without rewriting the code each time.
After opening the data sets, we separate the headers from the bulk of the data and store each part under an identifiable variable name. This gives us the number of columns (based on the headers) and the number of rows (of the data bodies). The header also lets us know the type of information within the data bodies, which is what we are keen on exploring.
android_dataset = open_dataset('googleplaystore.csv')
android_header = android_dataset[0]
android_data = android_dataset[1:]
ios_dataset = open_dataset('AppleStore.csv')
ios_header = ios_dataset[0]
ios_data = ios_dataset[1:]
print(android_header, '\n')
print("Number of rows in Google's dataset: " + str(len(android_data)) + " (excluding header)")
print("Number of columns: " + str(len(android_header)), '\n')
print(ios_header, '\n')
print("Number of rows in Apple's dataset: " + str(len(ios_data)) + " (excluding header)")
print("Number of columns: " + str(len(ios_header)))
['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']

Number of rows in Google's dataset: 10841 (excluding header)
Number of columns: 13

['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']

Number of rows in Apple's dataset: 7197 (excluding header)
Number of columns: 16
The Google Play data set has 10841 rows and 13 columns of apps data. At a quick glance, columns that might be useful for analyses are 'App', 'Category', 'Rating', 'Reviews', 'Installs', 'Type', 'Price', 'Content Rating' and 'Genres'.
The App Store data set has 7197 rows and 16 columns of apps data. Useful columns might include 'track_name', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'cont_rating' and 'prime_genre'.
Not all column names are self-explanatory in this case, but details about each column can be found in the data set documentation.
The Google Play data set has a dedicated discussion section, and within one of the discussions an error for row 10472 has been outlined.
To scan both sets of data, we use another function that helps us locate rows with missing column data.
def row_check(file_header, file_data):
    length_header = len(file_header)
    error_rows = []
    for row in file_data:
        if len(row) != length_header:
            error_rows.append(file_data.index(row))
            # Returns as soon as the first mismatched row is found; if no row
            # mismatches, the function falls through and returns None.
            return error_rows
print("Google's row-error found at: ", row_check(android_header, android_data))
print("Apple's row_error found at: ", row_check(ios_header, ios_data))
Google's row-error found at: [10472] Apple's row_error found at: None
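The None printed for Apple suggests that row_check() stops at the first mismatched row it meets. As a minimal sketch, a variant that reports every mismatched row could look like the following. The name find_bad_rows() and the sample rows are made up for illustration and are not part of either data set.

```python
def find_bad_rows(header, rows):
    """Return the index of every row whose length differs from the header's."""
    expected = len(header)
    # enumerate() gives the position directly, so no list.index() lookup is needed.
    return [i for i, row in enumerate(rows) if len(row) != expected]


# Hypothetical sample, not taken from either data set.
sample_header = ['App', 'Category', 'Rating']
sample_rows = [
    ['A', 'TOOLS', '4.1'],
    ['B', '4.5'],            # one value missing
    ['C', 'GAME', '3.9'],
    ['D'],                   # two values missing
]

print(find_bad_rows(sample_header, sample_rows))  # [1, 3]
```

An empty list then means "no mismatched rows", which is easier to handle downstream than None.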
It appears that every row in the App Store data has the same number of columns as its header. Let's print the flagged row from the Google Play data set and compare it against both the header and a row that is correct.
print(android_header, '\n') # header
print(android_data[0], '\n') # correct row
print(android_data[10472]) # incorrect row
['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']

['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']

['Life Made WI-Fi Touchscreen Photo Frame', '1.9', '19', '3.0M', '1,000+', 'Free', '0', 'Everyone', '', 'February 11, 2018', '1.0.19', '4.0 and up']
Row 10472 corresponds to the app "Life Made WI-Fi Touchscreen Photo Frame". Its value in the 'Rating' position is 19, which is clearly off because the maximum rating for a Google Play app is 5 (as mentioned in the discussions section). As highlighted there, the problem is caused by a missing value in the 'Category' column, which shifts every later value one position to the left. As such, we'll delete this row.
print(len(android_data))
del android_data[10472] # don't run this more than once
print(len(android_data))
10841 10840
Besides missing column(s) among the rows of data, there could be duplicated entries of the same app too, hence requiring further cleaning of data prior to analysis.
For a start, we extract a list of apps from each data set, whereby each app is only listed once.
def extract(indexed_column, data_set):
    column_list = []
    for row in data_set:
        app = row[indexed_column]
        column_list.append(app)
    return column_list
android_n_apps = set(extract(0, android_data))
ios_n_apps = set(extract(1, ios_data))
print(len(android_n_apps), len(ios_n_apps))
9659 7195
There are altogether 9659 unique apps in Google Play and 7195 unique apps in the App Store.
Next, we use another function to separate each data set into two lists as follows:
def multiple_entries(indexed_column, data_set):
    duplicate_apps = []  # full rows whose app name has already been seen
    unique_apps = []     # app names, each listed once
    for row in data_set:
        app_name = row[indexed_column]
        if app_name in unique_apps:
            duplicate_apps.append(row)
        else:
            unique_apps.append(app_name)
    return unique_apps, duplicate_apps
android_unique, android_duplicates = multiple_entries(0, android_data)
print('Number of duplicate google apps:', len(android_duplicates))
print('Number of unique google apps:', len(android_unique), '\n')
ios_unique, ios_duplicates = multiple_entries(1, ios_data)
print('Number of duplicate apple apps:', len(ios_duplicates))
print('Number of unique apple apps:', len(ios_unique))
Number of duplicate google apps: 1181
Number of unique google apps: 9659

Number of duplicate apple apps: 2
Number of unique apple apps: 7195
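As a cross-check on the counts above, the same tally can be sketched with collections.Counter from the standard library. The helper name duplicate_counts() and the sample rows are hypothetical; with the real data one would pass android_data or ios_data and the index of the name column.

```python
from collections import Counter


def duplicate_counts(rows, name_index):
    """Map each app name that occurs more than once to its number of rows."""
    counts = Counter(row[name_index] for row in rows)
    return {name: n for name, n in counts.items() if n > 1}


# Hypothetical sample rows: [name, reviews]
sample_rows = [['Box', '10'], ['Slack', '20'], ['Box', '12'], ['Box', '11']]

dupes = duplicate_counts(sample_rows, 0)
print(dupes)  # {'Box': 3}

# Rows beyond the first one for each repeated name, i.e. what multiple_entries()
# would collect in its duplicates list.
extra_rows = sum(n - 1 for n in dupes.values())
print(extra_rows)  # 2
```

Counter performs one dictionary lookup per row, whereas testing membership in a list rescans the list for every row.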
For Google, there are 1,181 rows that repeat an app already listed earlier in the data set, e.g. Instagram has four entries:
for item in android_data:
    app = item[0]
    if app == 'Instagram':
        print(item, '\n')
['Instagram', 'SOCIAL', '4.5', '66577313', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'] ['Instagram', 'SOCIAL', '4.5', '66577446', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'] ['Instagram', 'SOCIAL', '4.5', '66577313', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device'] ['Instagram', 'SOCIAL', '4.5', '66509917', 'Varies with device', '1,000,000,000+', 'Free', '0', 'Teen', 'Social', 'July 31, 2018', 'Varies with device', 'Varies with device']
print('Examples of other duplicate apps: ', extract(0, android_duplicates[:15]))
Examples of other duplicate apps: ['Quick PDF Scanner + OCR FREE', 'Box', 'Google My Business', 'ZOOM Cloud Meetings', 'join.me - Simple Meetings', 'Box', 'Zenefits', 'Google Ads', 'Google My Business', 'Slack', 'FreshBooks Classic', 'Insightly CRM', 'QuickBooks Accounting: Invoicing & Expenses', 'HipChat - Chat Built for Teams', 'Xero Accounting Software']
However, contrary to the observation made in the discussions that no duplicates exist in the Apple data set, the multiple_entries() function found 2 duplicate names in it.
for item in ios_duplicates:
    print(item[1])
for row in ios_data:
    name = row[1]
    if name == 'Mannequin Challenge' or name == 'VR Roller Coaster':
        print('\n', row)
Mannequin Challenge
VR Roller Coaster

['1173990889', 'Mannequin Challenge', '109705216', 'USD', '0.0', '668', '87', '3.0', '3.0', '1.4', '9+', 'Games', '37', '4', '1', '1']

['952877179', 'VR Roller Coaster', '169523200', 'USD', '0.0', '107', '102', '3.5', '3.5', '2.0.0', '4+', 'Games', '37', '5', '1', '1']

['1178454060', 'Mannequin Challenge', '59572224', 'USD', '0.0', '105', '58', '4.0', '4.5', '1.0.1', '4+', 'Games', '38', '5', '1', '1']

['1089824278', 'VR Roller Coaster', '240964608', 'USD', '0.0', '67', '44', '3.5', '4.0', '0.81', '4+', 'Games', '38', '0', '1', '1']
To avoid counting certain apps more than once during our analyses, we need to remove the duplicate entries and keep only one entry per app. However, instead of removing the duplicate rows randomly, we could choose a criterion to decide which entries to keep.
Looking at 'Instagram', 'Mannequin Challenge' and 'VR Roller Coaster', the main difference among the duplicates lies in the number of reviews, i.e. the fourth position for Google and the sixth position for Apple.
The different numbers show that the review-data was collected at different times. The higher the number of reviews, the more reliable the ratings. As such, we can use reviews as a criterion, keeping the row with the highest numbers for each app.
To do that, we will:
def max_reviewers(file_name, index_1, index_2):  # duplicate-free data set based on max no. of reviewers
    reviews_max = {}    # app name -> highest number of reviews seen so far
    data_cleaned = []   # one row per app name
    already_added = []  # names that already have a row in data_cleaned
    for app in file_name:
        name, n_reviews = app[index_1], float(app[index_2])
        if name in reviews_max and (reviews_max[name] < n_reviews):
            reviews_max[name] = n_reviews
        elif name not in reviews_max:
            reviews_max[name] = n_reviews
        # Note: this check runs in the same pass, so the first row seen for a
        # name is the one appended; a later row with more reviews updates
        # reviews_max but is not added.
        if (reviews_max[name] == n_reviews) and (name not in already_added):
            data_cleaned.append(app)
            already_added.append(name)
    return data_cleaned
The max_reviewers() function above builds the dictionary reviews_max, which tracks the highest number of reviews seen for each app, and returns a list with one row per app. In the code cell below, we apply it to both data sets:
android_cleaned = max_reviewers(android_data, 0, 3)
ios_cleaned = max_reviewers(ios_data, 1, 5)
print(len(android_cleaned), len(ios_cleaned))
9659 7195
After removing the duplicates, the number of rows in each data set corresponds to their respective number of unique apps.
However, another disparity was found when comparing the first three rows of the new android data set.
for item in android_cleaned[0:3]:
    print(item, '\n')
['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up'] ['Coloring book moana', 'ART_AND_DESIGN', '3.9', '967', '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up'] ['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']
The first three apps shown in the 'Solutions' are 'Photo Editor & Candy Camera & Grid & ScrapBook', 'U Launcher Lite – FREE Live Cool Themes, Hide Apps' and 'Sketch - Draw & Paint'.
But the first three apps shown after using max_reviewers() include 'Coloring book moana', which is absent from the 'Solutions' output produced with the function explore_data(), i.e. 'explore_data(android_clean, 0, 3, True)'.
It's also worth noting that this app is listed under two different categories, with 967 reviews under 'ART_AND_DESIGN' and 974 under 'FAMILY'. The row kept in android_cleaned is the 'ART_AND_DESIGN' one, so the app no longer appears in the 'FAMILY' category, even though the 'FAMILY' row has the higher number of reviews.
for item in android_data:
    app = item[0]
    if app == 'Coloring book moana':
        print(item, '\n')
['Coloring book moana', 'ART_AND_DESIGN', '3.9', '967', '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up'] ['Coloring book moana', 'FAMILY', '3.9', '974', '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up']
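The two rows above, together with the first rows of android_cleaned, suggest that the row with 967 reviews was kept rather than the one with 974. One possible way to always keep the row with the most reviews is to split the work into two passes, as in the sketch below. The name keep_max_reviews() is hypothetical, and the sample rows are shortened versions of the two rows printed above.

```python
def keep_max_reviews(rows, name_index, reviews_index):
    """Keep one row per app name: the first row that carries the highest review count."""
    # Pass 1: find the highest review count for each name.
    best = {}
    for row in rows:
        name, n_reviews = row[name_index], float(row[reviews_index])
        if name not in best or n_reviews > best[name]:
            best[name] = n_reviews

    # Pass 2: keep the first row matching that maximum.
    cleaned = []
    added = set()
    for row in rows:
        name, n_reviews = row[name_index], float(row[reviews_index])
        if n_reviews == best[name] and name not in added:
            cleaned.append(row)
            added.add(name)
    return cleaned


# Shortened rows: [name, category, rating, reviews]
sample = [
    ['Coloring book moana', 'ART_AND_DESIGN', '3.9', '967'],
    ['Coloring book moana', 'FAMILY', '3.9', '974'],
]

print(keep_max_reviews(sample, 0, 3))
# [['Coloring book moana', 'FAMILY', '3.9', '974']]
```

Because the maximum is known before any row is kept, the result no longer depends on the order in which duplicates appear. Switching to this variant would change which rows survive, so the counts reported further down would need to be recomputed.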
print(ios_cleaned[813][1])
print(ios_cleaned[6731][1], '\n')
print(android_cleaned[5326][0])
print(android_cleaned[8938][0])
爱奇艺PPS -《欢乐颂2》电视剧热播 激ムズ!和のひとふで書き! 〜頭をつかう脳トレパズルゲーム〜 뽕티비 - 개인방송, 인터넷방송, BJ방송 Ey Sey Storytime រឿងនិទានតាឥសី
As our focus is on English apps, we would need to remove the non-English ones.
One way to go about this is to remove each app whose name contains a symbol that is not commonly used in English text — English text usually includes letters from the English alphabet, numbers composed of digits from 0 to 9, punctuation marks (., !, ?, ;, etc.), and other symbols (+, *, /, etc.).
All these characters that are specific to English texts are encoded using the ASCII standard. Each ASCII character has a corresponding number between 0 and 127 associated with it, and we can take advantage of that to build a function that checks an app name and tells us whether it contains non-ASCII characters.
Using the built-in ord() function, which returns the encoding number of a character, we build the function below to check whether every character in a name falls within the ASCII range.
def is_english(string):
    for character in string:
        if ord(character) > 127:
            return False
    return True
print(is_english('Instagram'))
print(is_english('爱奇艺PPS -《欢乐颂2》电视剧热播'))
True False
While the function seems to work fine, some English app names that use emojis or other symbols (™, — (em dash), – (en dash), etc.) will fall outside of the ASCII range. As such, we'll remove useful apps if we use the function in its current form.
print(is_english('Docs To Go™ Free Office Suite'))
print(is_english('Instachat 😜'))
print(ord('™'))
print(ord('😜'))
False False 8482 128540
To minimize the impact of data loss, we'll only remove an app if its name has more than three non-ASCII characters.
To also catch names with three or fewer non-ASCII characters, we include another condition that compares the number of non-ASCII characters against the number of characters in the full name of the app. If the two match, e.g. where two Chinese characters form the whole name of the app, we remove that app too.
The function is still not perfect, and very few non-English apps might get past our filter, but this seems good enough at this point in our analysis.
def is_english(string):
    non_ascii = 0
    for character in string:
        if ord(character) > 127:
            non_ascii += 1
    if non_ascii > 3 or (non_ascii <= 3 and non_ascii == len(string)):
        return False
    else:
        return True
print(is_english('Docs To Go™ Free Office Suite'))
print(is_english('Instachat 😜'))
print(is_english('爱奇艺PPS -《欢乐颂2》电视剧热播'))
print(is_english('豆瓣'))
print(is_english('教えて!goo'))
True True False False True
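The cut-off of three non-ASCII characters is a judgment call, so it may help to make it a parameter and see how the result changes. The sketch below is intended to apply the same two rules as is_english() above; the name is_mostly_ascii() and the parameter max_non_ascii are hypothetical.

```python
def is_mostly_ascii(name, max_non_ascii=3):
    """Return True if the name passes the two rules used by is_english() above."""
    non_ascii = sum(1 for character in name if ord(character) > 127)
    # Rule 2: a name made up entirely of non-ASCII characters is rejected,
    # however short it is.
    if non_ascii == len(name):
        return False
    # Rule 1: otherwise allow up to max_non_ascii non-ASCII characters.
    return non_ascii <= max_non_ascii


print(is_mostly_ascii('Docs To Go™ Free Office Suite'))                   # True
print(is_mostly_ascii('豆瓣'))                                             # False
print(is_mostly_ascii('Docs To Go™ Free Office Suite', max_non_ascii=0))  # False
```

With max_non_ascii=0 the function behaves like the first, strict version, which makes it easy to count how many apps each threshold would drop.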
android_english = []
ios_english = []
for app in android_cleaned:
    name = app[0]
    if is_english(name):
        android_english.append(app)
for app in ios_cleaned:
    name = app[1]
    if is_english(name):
        ios_english.append(app)
print(len(android_english), len(ios_english))
9614 6163
We can see that we're left with 9614 Android apps and 6163 iOS apps; the latter is 20 fewer than the number shown in the 'Solutions', owing to the additional condition.
As mentioned in the introduction, our company only builds apps that are free to download and install, and our main source of revenue consists of in-app ads. Our data sets contain both free and non-free apps, so we need to isolate the free apps for our analysis. Below, we do this for both data sets.
android_final = []
for app in android_english:
    price = app[7]
    if price == '0' or price == '0.0':
        android_final.append(app)
ios_final = []
for app in ios_english:
    price = app[4]
    if price == '0' or price == '0.0':
        ios_final.append(app)
print(len(android_final), len(ios_final))
8862 3204
We're left with 8862 Android apps and 3204 iOS apps, which should be adequate for our analysis. Once again, these numbers differ from those in the 'Solutions' (i.e. 8864 for Android and 3222 for iOS).
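The filter above compares the price text against '0' and '0.0'. A slightly more tolerant check could convert the text to a number first, as in the sketch below. The helper name is_free() is hypothetical, and the handling of a leading '$' is an assumption about how paid prices might be written, not something confirmed by the output shown here.

```python
def is_free(price_text):
    """Return True if the price text represents zero; unparseable text counts as not free."""
    cleaned = price_text.strip().lstrip('$')  # assumed: prices may carry a leading '$'
    try:
        return float(cleaned) == 0.0
    except ValueError:
        # Anything that is not a number (e.g. a shifted column value) is not treated as free.
        return False


for text in ['0', '0.0', '$4.99', 'Everyone']:
    print(text, is_free(text))
```

With this helper, the two loops would test is_free(price) instead of comparing strings.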
As mentioned in the introduction, our aim is to determine the kinds of apps that are likely to attract more users because our revenue is highly influenced by the number of people using our apps.
Because our end goal is to add the app on both the App Store and Google Play, we need to find app profiles that are successful on both markets.
Let's begin the analysis by getting a sense of the most common genres for each market. For a start, we'll take a look at the prime_genre column of the App Store data set, and the Genres and Category columns of the Google Play data set.
android_genres = set(extract(9, android_final))
ios_genres = set(extract(11, ios_final))
android_categories = set(extract(1, android_final))
print('Number of android app genres: ', len(android_genres))
print('Number of android app categories: ', len(android_categories))
print('Number of iOS app genres: ', len(ios_genres))
Number of android app genres: 114 Number of android app categories: 33 Number of iOS app genres: 23
For Android apps, the difference between the Genres and the Category columns is not crystal clear. One thing is certain: the Genres column is much more granular (it has more distinct values). Since we're only looking for the bigger picture at the moment, we'll only work with the Category column moving forward.
We'll build a frequency table for the prime_genre column of the App Store data set and the Category column of the Google Play data set.
Thereafter, to analyze the frequency tables, we'll build:
def freq_table(dataset, index):
    table = {}
    total = 0
    for row in dataset:
        total += 1
        item = row[index]
        if item in table:
            table[item] += 1
        else:
            table[item] = 1
    percentage_table = {}
    for item in table:
        percentage = (table[item] / total) * 100
        percentage_table[item] = percentage
    return percentage_table
ios_gen_ft = freq_table(ios_final, 11)
import operator
ios_gen_ft_s = sorted(ios_gen_ft.items(), key=operator.itemgetter(1), reverse = True)
for entry in ios_gen_ft_s:
    print(entry[0], ":", round(entry[1], 3))
Games : 58.146
Entertainment : 7.896
Photo & Video : 4.994
Education : 3.683
Social Networking : 3.246
Shopping : 2.622
Utilities : 2.528
Sports : 2.154
Music : 2.06
Health & Fitness : 2.029
Productivity : 1.748
Lifestyle : 1.561
News : 1.342
Travel : 1.217
Finance : 1.092
Weather : 0.874
Food & Drink : 0.811
Reference : 0.562
Business : 0.531
Book : 0.406
Navigation : 0.187
Medical : 0.187
Catalogs : 0.125
android_cat_ft = freq_table(android_final, 1)  # Focusing on 'categories' instead of 'genres'.
import operator
android_cat_ft_s = sorted(android_cat_ft.items(), key=operator.itemgetter(1), reverse = True)
for entry in android_cat_ft_s:
    print(entry[0], ":", round(entry[1], 3))
FAMILY : 18.45
GAME : 9.874
TOOLS : 8.441
BUSINESS : 4.593
LIFESTYLE : 3.904
PRODUCTIVITY : 3.893
FINANCE : 3.701
MEDICAL : 3.521
SPORTS : 3.397
PERSONALIZATION : 3.318
COMMUNICATION : 3.239
HEALTH_AND_FITNESS : 3.081
PHOTOGRAPHY : 2.945
NEWS_AND_MAGAZINES : 2.798
SOCIAL : 2.663
TRAVEL_AND_LOCAL : 2.336
SHOPPING : 2.246
BOOKS_AND_REFERENCE : 2.144
DATING : 1.862
VIDEO_PLAYERS : 1.783
MAPS_AND_NAVIGATION : 1.399
EDUCATION : 1.286
FOOD_AND_DRINK : 1.241
ENTERTAINMENT : 1.128
LIBRARIES_AND_DEMO : 0.937
AUTO_AND_VEHICLES : 0.925
HOUSE_AND_HOME : 0.835
WEATHER : 0.801
EVENTS : 0.711
ART_AND_DESIGN : 0.677
PARENTING : 0.654
COMICS : 0.621
BEAUTY : 0.598
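The frequency tables above are built with a plain dictionary and then sorted with operator.itemgetter. The same result can be sketched more compactly with collections.Counter, whose most_common() method returns the entries already sorted by count. The name freq_table_pct() and the sample rows are hypothetical.

```python
from collections import Counter


def freq_table_pct(rows, index):
    """Return (value, percentage) pairs for one column, most frequent first."""
    counts = Counter(row[index] for row in rows)
    total = sum(counts.values())
    return [(value, count / total * 100) for value, count in counts.most_common()]


# Hypothetical sample rows: [name, genre]
sample = [['a', 'Games'], ['b', 'Games'], ['c', 'Music'], ['d', 'Games']]

for genre, pct in freq_table_pct(sample, 1):
    print(genre, ':', round(pct, 3))
# Games : 75.0
# Music : 25.0
```

An empty data set simply yields an empty list, since the percentage is only computed for values that were actually counted.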
App Store is seemingly dominated by apps meant for entertainment, while Google Play displays a relatively more balanced landscape of both practical and for-fun apps.
Although Google Play also has many apps designed for fun, it appears to have a good number of apps designed for practical purposes (family, tools, business, lifestyle, productivity, etc.). However, upon further investigation, we discover that the family category (which accounts for about 18% of the apps) contains mostly games for kids.
Besides the share of each genre by number of apps, we would like to identify the kinds of apps that bring in the most users.
One way to find out which genres are the most popular (i.e. have the most users) is to calculate the average number of installs for each app genre. For the Google Play data set, we can find this information in the Installs column, but for the App Store data set this information is missing. As a workaround, we'll take the total number of user ratings as a proxy, which we can find in the rating_count_tot column.
Below, we calculate the average number of user ratings per app genre on the App Store:
def average_n_ratings(dataset_1, dataset_2, index_1, index_2):
    table = {}
    for genre in dataset_1:
        total = 0
        len_genre = 0
        for app in dataset_2:
            app_genre = app[index_1]
            if app_genre == genre:
                n_ratings = float(app[index_2])
                total += n_ratings
                len_genre += 1
        avg_n_rtg = total / len_genre
        table[genre] = avg_n_rtg
    import operator
    sorted_table = sorted(table.items(), key=operator.itemgetter(1), reverse = True)
    return sorted_table
ios_anr = average_n_ratings(ios_genres, ios_final, 11, 5)
for entry in ios_anr:
    print(entry[0], ':', f'{entry[1]:,.2f}')
Navigation : 86,090.33
Reference : 74,942.11
Social Networking : 72,916.55
Music : 57,326.53
Weather : 52,279.89
Book : 42,816.85
Food & Drink : 33,333.92
Finance : 32,367.03
Travel : 28,964.05
Photo & Video : 28,441.54
Shopping : 26,919.69
Health & Fitness : 23,298.02
Sports : 23,008.90
Games : 22,923.13
News : 21,248.02
Productivity : 21,028.41
Utilities : 18,684.46
Lifestyle : 16,815.48
Entertainment : 14,085.28
Business : 7,491.12
Education : 7,003.98
Catalogs : 4,004.00
Medical : 612.00
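average_n_ratings() loops over the whole data set once per genre. As a sketch of an alternative, the totals and counts for all genres can be accumulated in a single pass with collections.defaultdict. The name average_by_group() and the sample rows are hypothetical.

```python
from collections import defaultdict


def average_by_group(rows, group_index, value_index):
    """Average a numeric column per group in one pass, sorted from highest to lowest."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        group = row[group_index]
        totals[group] += float(row[value_index])
        counts[group] += 1
    averages = {group: totals[group] / counts[group] for group in totals}
    return sorted(averages.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical sample rows: [name, genre, number of ratings]
sample = [
    ['app 1', 'Navigation', '300'],
    ['app 2', 'Navigation', '100'],
    ['app 3', 'Reference', '50'],
]

for genre, avg in average_by_group(sample, 1, 2):
    print(genre, ':', f'{avg:,.2f}')
# Navigation : 200.00
# Reference : 50.00
```

The list of genres no longer has to be passed in, because the groups are discovered while reading the rows.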
On average, navigation apps have the highest number of user reviews, but this figure is heavily influenced by Waze and Google Maps, which have close to half a million user reviews together, skewing the average number:
for app in ios_final:
    if app[-5] == 'Navigation':
        print(app[1], ':', app[5])  # print name and number of ratings
Waze - GPS Navigation, Maps & Real-time Traffic : 345046
Google Maps - Navigation & Transit : 154911
Geocaching® : 12811
CoPilot GPS – Car Navigation & Offline Maps : 3582
ImmobilienScout24: Real Estate Search in Germany : 187
Railway Route Search : 5
The same pattern applies to:
As such, navigation, reference, social networking or music apps might seem more popular than they really are. The average number of ratings seems to be skewed by very few apps which have hundreds of thousands of user ratings, while the other apps may struggle to get past the 10,000 threshold.
We could get a better picture by removing these extremely popular apps for each genre and then rework the averages, but we'll leave this level of detail for later.
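As a rough illustration of how much the largest apps move the average, the sketch below compares the mean, the median, and a mean that leaves out apps above a cap, using the six navigation rating counts printed earlier. The helper mean_below() and the cap of 100,000 are hypothetical choices for illustration, not part of the analysis.

```python
from statistics import mean, median

# Rating counts of the six navigation apps printed earlier.
navigation_ratings = [345046, 154911, 12811, 3582, 187, 5]


def mean_below(values, cap):
    """Mean of the values that do not exceed cap; None if nothing is left."""
    kept = [value for value in values if value <= cap]
    return mean(kept) if kept else None


print(round(mean(navigation_ratings), 2))       # 86090.33
print(median(navigation_ratings))               # 8196.5
print(mean_below(navigation_ratings, 100_000))  # 4146.25
```

The median and the capped mean are both far below the plain average, which is consistent with the observation that two apps account for most of the ratings in this genre.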
for app in ios_final:
    if app[-5] == 'Reference':
        print(app[1], ':', app[5])  # print name and number of ratings
Bible : 985920 Dictionary.com Dictionary & Thesaurus : 200047 Dictionary.com Dictionary & Thesaurus for iPad : 54175 Google Translate : 26786 Muslim Pro: Ramadan 2017 Prayer Times, Azan, Quran : 18418 New Furniture Mods - Pocket Wiki & Game Tools for Minecraft PC Edition : 17588 Merriam-Webster Dictionary : 16849 Night Sky : 12122 City Maps for Minecraft PE - The Best Maps for Minecraft Pocket Edition (MCPE) : 8535 LUCKY BLOCK MOD ™ for Minecraft PC Edition - The Best Pocket Wiki & Mods Installer Tools : 4693 GUNS MODS for Minecraft PC Edition - Mods Tools : 1497 Guides for Pokémon GO - Pokemon GO News and Cheats : 826 WWDC : 762 Horror Maps for Minecraft PE - Download The Scariest Maps for Minecraft Pocket Edition (MCPE) Free : 718 VPN Express : 14 Real Bike Traffic Rider Virtual Reality Glasses : 8 教えて!goo : 0 Jishokun-Japanese English Dictionary & Translator : 0
for app in ios_final:
    if app[-5] == 'Social Networking':
        print(app[1], ':', app[5])
Facebook : 2974676 Pinterest : 1061624 Skype for iPhone : 373519 Messenger : 351466 Tumblr : 334293 WhatsApp Messenger : 287589 Kik : 260965 ooVoo – Free Video Call, Text and Voice : 177501 TextNow - Unlimited Text + Calls : 164963 Viber Messenger – Text & Call : 164249 Followers - Social Analytics For Instagram : 112778 MeetMe - Chat and Meet New People : 97072 We Heart It - Fashion, wallpapers, quotes, tattoos : 90414 InsTrack for Instagram - Analytics Plus More : 85535 Tango - Free Video Call, Voice and Chat : 75412 LinkedIn : 71856 Match™ - #1 Dating App. : 60659 Skype for iPad : 60163 POF - Best Dating App for Conversations : 52642 Timehop : 49510 Find My Family, Friends & iPhone - Life360 Locator : 43877 Whisper - Share, Express, Meet : 39819 Hangouts : 36404 LINE PLAY - Your Avatar World : 34677 WeChat : 34584 Badoo - Meet New People, Chat, Socialize. : 34428 Followers + for Instagram - Follower Analytics : 28633 GroupMe : 28260 Marco Polo Video Walkie Talkie : 27662 Miitomo : 23965 SimSimi : 23530 Grindr - Gay and same sex guys chat, meet and date : 23201 Wishbone - Compare Anything : 20649 imo video calls and chat : 18841 After School - Funny Anonymous School News : 18482 Quick Reposter - Repost, Regram and Reshare Photos : 17694 Weibo HD : 16772 Repost for Instagram : 15185 Live.me – Live Video Chat & Make Friends Nearby : 14724 Nextdoor : 14402 Followers Analytics for Instagram - InstaReport : 13914 YouNow: Live Stream Video Chat : 12079 FollowMeter for Instagram - Followers Tracking : 11976 LINE : 11437 eHarmony™ Dating App - Meet Singles : 11124 Discord - Chat for Gamers : 9152 QQ : 9109 Telegram Messenger : 7573 Weibo : 7265 Periscope - Live Video Streaming Around the World : 6062 Chat for Whatsapp - iPad Version : 5060 QQ HD : 5058 Followers Analysis Tool For Instagram App Free : 4253 live.ly - live video streaming : 4145 Houseparty - Group Video Chat : 3991 SOMA Messenger : 3232 Monkey : 3060 Down To Lunch : 2535 Flinch - Video Chat Staring Contest 
: 2134 Highrise - Your Avatar Community : 2011 LOVOO - Dating Chat : 1985 PlayStation®Messages : 1918 BOO! - Video chat camera with filters & stickers : 1805 Qzone : 1649 Chatous - Chat with new people : 1609 Kiwi - Q&A : 1538 GhostCodes - a discovery app for Snapchat : 1313 Jodel : 1193 FireChat : 1037 Google Duo - simple video calling : 1033 Fiesta by Tango - Chat & Meet New People : 885 Google Allo — smart messaging : 862 Peach — share vividly : 727 Hey! VINA - Where Women Meet New Friends : 719 Battlefield™ Companion : 689 All Devices for WhatsApp - Messenger for iPad : 682 Chat for Pokemon Go - GoChat : 500 IAmNaughty – Dating App to Meet New People Online : 463 Qzone HD : 458 Zenly - Locate your friends in realtime : 427 League of Legends Friends : 420 Candid - Speak Your Mind Freely : 398 Selfeo : 366 Fake-A-Location Free ™ : 354 Popcorn Buzz - Free Group Calls : 281 Fam — Group video calling for iMessage : 279 QQ International : 274 Ameba : 269 SoundCloud Pulse: for creators : 240 Tantan : 235 Cougar Dating & Life Style App for Mature Women : 213 Rawr Messenger - Dab your chat : 180 WhenToPost: Best Time to Post Photos for Instagram : 158 Inke—Broadcast an amazing life : 147 Mustknow - anonymous video Q&A : 53 CTFxCmoji : 39 Lobi : 36 Chain: Collaborate On MyVideo Story/Group Video : 35 botman - Real time video chat : 7 BestieBox : 0 MATCH ON LINE chat : 0 niconico ch : 0 LINE BLOG : 0 bit-tube - Live Stream Video Chat : 0
for app in ios_final:
    if app[-5] == 'Music':
        print(app[1], ':', app[5])
Pandora - Music & Radio : 1126879 Spotify Music : 878563 Shazam - Discover music, artists, videos & lyrics : 402925 iHeartRadio – Free Music & Radio Stations : 293228 SoundCloud - Music & Audio : 135744 Magic Piano by Smule : 131695 Smule Sing! : 119316 TuneIn Radio - MLB NBA Audiobooks Podcasts Music : 110420 Amazon Music : 106235 SoundHound Song Search & Music Player : 82602 Sonos Controller : 48905 Bandsintown Concerts : 30845 Karaoke - Sing Karaoke, Unlimited Songs! : 28606 My Mixtapez Music : 26286 Sing Karaoke Songs Unlimited with StarMaker : 26227 Ringtones for iPhone & Ringtone Maker : 25403 Musi - Unlimited Music For YouTube : 25193 AutoRap by Smule : 18202 Spinrilla - Mixtapes For Free : 15053 Napster - Top Music & Radio : 14268 edjing Mix:DJ turntable to remix and scratch music : 13580 Free Music - MP3 Streamer & Playlist Manager Pro : 13443 Free Piano app by Yokee : 13016 Google Play Music : 10118 Certified Mixtapes - Hip Hop Albums & Mixtapes : 9975 TIDAL : 7398 YouTube Music : 7109 Nicki Minaj: The Empire : 5196 Sounds app - Music And Friends : 5126 SongFlip - Free Music Streamer : 5004 Simple Radio - Live AM & FM Radio Stations : 4787 Deezer - Listen to your Favorite Music & Playlists : 4677 Ringtones for iPhone with Ringtone Maker : 4013 Bose SoundTouch : 3687 Amazon Alexa : 3018 DatPiff : 2815 Trebel Music - Unlimited Music Downloader : 2570 Free Music Play - Mp3 Streamer & Player : 2496 Acapella from PicPlayPost : 2487 Coach Guitar - Lessons & Easy Tabs For Beginners : 2416 Musicloud - MP3 and FLAC Music Player for Cloud Platforms. 
: 2211 Piano - Play Keyboard Music Games with Magic Tiles : 1636 Boom: Best Equalizer & Magical Surround Sound : 1375 Music Freedom - Unlimited Free MP3 Music Streaming : 1246 AmpMe - A Portable Social Party Music Speaker : 1047 Medly - Music Maker : 933 Bose Connect : 915 Music Memos : 909 UE BOOM : 612 LiveMixtapes : 555 NOISE : 355 MP3 Music Player & Streamer for Clouds : 329 Musical Video Maker - Create Music clips lip sync : 320 Cloud Music Player - Downloader & Playlist Manager : 319 Remixlive - Remix loops with pads : 288 QQ音乐HD : 224 Blocs Wave - Make & Record Music : 158 PlayGround • Music At Your Fingertips : 150 Music and Chill : 135 The Singing Machine Mobile Karaoke App : 130 radio.de - Der Radioplayer : 64 Free Music - Player & Streamer for Dropbox, OneDrive & Google Drive : 46 NRJ Radio : 38 Smart Music: Streaming Videos and Radio : 17 BOSS Tuner : 13 PetitLyrics : 0
The fact that the App Store is dominated by for-fun apps might suggest that the market is somewhat saturated with them. In other words, a practical app might have a better chance of standing out among the huge number of apps on the App Store.
Now let's analyze the Google Play market a bit.
For the Google Play market, we actually have data about the number of installs, so we should be able to get a clearer picture about genre popularity. However, the install numbers don't seem precise enough — we can see that most values are open-ended (100+, 1,000+, 5,000+, etc.):
for k, v in freq_table(android_final, 5).items():
    print(k, ":", v)
10,000+ : 10.212141728729407 500,000+ : 5.574362446400361 5,000,000+ : 6.82690137666441 50,000,000+ : 2.279395170390431 100,000+ : 11.543669600541637 50,000+ : 4.773188896411646 1,000,000+ : 15.730083502595352 10,000,000+ : 10.550665763935905 5,000+ : 4.513653802753328 100,000,000+ : 2.1214172872940646 1,000,000,000+ : 0.22568269013766643 1,000+ : 8.395396073121193 500,000,000+ : 0.2708192281651997 500+ : 3.2498307379823967 100+ : 6.917174452719477 50+ : 1.9183028661701647 10+ : 3.5432182351613632 1+ : 0.5077860528097494 5+ : 0.7898894154818324 0+ : 0.045136538027533285 0 : 0.011284134506883321
Unfortunately, this data is not precise.
For instance, we can't tell whether an app with 100,000+ installs has 100,000, 200,000 or 350,000 installs. Nevertheless, we don't need perfect precision for our purposes; we only want an idea of which app genres attract the most users.
We're going to leave the numbers as they are, i.e. we'll consider that an app with 100,000+ installs has 100,000 installs, and an app with 1,000,000+ installs has 1,000,000 installs, and so on.
To perform computations, however, we'll need to convert each install number to float — this means that we need to remove the commas and the plus characters, otherwise the conversion will fail and raise an error. We'll do this directly in the loop below, where we also compute the average number of installs for each genre (category).
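Before it is used inside the function, the cleaning step can be checked in isolation. The helper name `parse_installs` is hypothetical (it is not part of the original notebook); this is only a minimal sketch of the comma and plus-sign removal described above.

```python
def parse_installs(raw):
    """Convert an install string such as '1,000,000+' to a float.

    Hypothetical helper for illustration only. The open-ended '+' is
    dropped, so the result is the lower bound of the install bucket.
    """
    cleaned = raw.replace(',', '').replace('+', '')
    return float(cleaned)


# A few of the bucket labels that appear in the frequency table above.
for raw in ['0', '100+', '100,000+', '1,000,000,000+']:
    print(raw, '->', parse_installs(raw))
```

Calling `float('100,000+')` directly would raise a `ValueError`, which is why both characters are stripped first.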
def average_n_installs(dataset_1, dataset_2, index_1, index_2):
    table = {}
    for category in dataset_1:
        total = 0
        len_category = 0
        for app in dataset_2:
            app_category = app[index_1]
            if app_category == category:
                n_installs = app[index_2]
                n_installs = n_installs.replace(',', '')
                n_installs = n_installs.replace('+', '')
                total += float(n_installs)
                len_category += 1
        avg_n_instl = total / len_category
        table[category] = avg_n_instl
    import operator
    sorted_table = sorted(table.items(), key=operator.itemgetter(1), reverse = True)
    return sorted_table
android_ani = average_n_installs(android_categories, android_final, 1, 5)
for entry in android_ani:
    print(entry[0], ':', f'{entry[1]:,.2f}')
COMMUNICATION : 38,456,119.17 VIDEO_PLAYERS : 24,852,732.41 SOCIAL : 23,253,652.13 ENTERTAINMENT : 21,134,600.00 PHOTOGRAPHY : 17,805,627.64 PRODUCTIVITY : 16,787,331.34 GAME : 15,837,565.09 TRAVEL_AND_LOCAL : 13,984,077.71 TOOLS : 10,695,245.29 NEWS_AND_MAGAZINES : 9,549,178.47 BOOKS_AND_REFERENCE : 8,767,811.89 SHOPPING : 7,036,877.31 PERSONALIZATION : 5,201,482.61 WEATHER : 5,074,486.20 HEALTH_AND_FITNESS : 4,188,821.99 MAPS_AND_NAVIGATION : 4,056,941.77 SPORTS : 3,638,640.14 EDUCATION : 3,082,017.54 FAMILY : 2,691,618.16 FOOD_AND_DRINK : 1,924,897.74 ART_AND_DESIGN : 1,905,351.67 BUSINESS : 1,712,290.15 LIFESTYLE : 1,437,816.27 FINANCE : 1,387,692.48 HOUSE_AND_HOME : 1,313,681.91 DATING : 854,028.83 COMICS : 817,657.27 AUTO_AND_VEHICLES : 647,317.82 LIBRARIES_AND_DEMO : 638,503.73 PARENTING : 542,603.62 BEAUTY : 513,151.89 EVENTS : 253,542.22 MEDICAL : 120,616.49
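The function above rescans the whole data set once per category. As a possible alternative, the averages could be accumulated in a single pass. The sketch below is not the notebook's method: the function name `average_installs_single_pass` and the toy rows are made up for illustration, and it assumes the install column holds strings such as '10,000+'.

```python
from collections import defaultdict


def average_installs_single_pass(dataset, category_index, installs_index):
    """Average installs per category, visiting each row only once.

    Hypothetical alternative to average_n_installs(); returns a list of
    (category, average) tuples sorted from highest to lowest average.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for app in dataset:
        installs = app[installs_index].replace(',', '').replace('+', '')
        totals[app[category_index]] += float(installs)
        counts[app[category_index]] += 1
    averages = {cat: totals[cat] / counts[cat] for cat in totals}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Toy rows (made up, not taken from the data set); only the category
# (index 1) and installs (index 5) columns matter here.
toy_rows = [
    ['App A', 'TOOLS', '', '', '', '1,000+'],
    ['App B', 'TOOLS', '', '', '', '3,000+'],
    ['App C', 'GAME', '', '', '', '10,000+'],
]
print(average_installs_single_pass(toy_rows, 1, 5))
```

Because a category only enters the dictionaries when at least one app belongs to it, this version cannot divide by zero for an empty category.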
On average, communication apps have the most installs: 38,456,119. Once again, this number is heavily skewed upward by a few apps that have over one billion installs (WhatsApp, Facebook Messenger, Skype, Google Chrome, Gmail, and Hangouts), and a few others with over 100 and 500 million installs:
comm_apps_tl = 0
for app in android_final:
    if app[1] == 'COMMUNICATION':
        comm_apps_tl += 1
        print(app[0], ':', app[5]) # print name and number of installs
Messenger – Text and Video Chat for Free : 1,000,000,000+ WhatsApp Messenger : 1,000,000,000+ Messenger for SMS : 10,000,000+ Google Chrome: Fast & Secure : 1,000,000,000+ Messenger Lite: Free Calls & Messages : 100,000,000+ Gmail : 1,000,000,000+ Hangouts : 1,000,000,000+ Viber Messenger : 500,000,000+ My Tele2 : 5,000,000+ Firefox Browser fast & private : 100,000,000+ Yahoo Mail – Stay Organized : 100,000,000+ imo beta free calls and text : 100,000,000+ imo free video calls and chat : 500,000,000+ Contacts : 50,000,000+ Call Free – Free Call : 5,000,000+ Web Browser & Explorer : 5,000,000+ Opera Mini - fast web browser : 100,000,000+ Browser 4G : 10,000,000+ MegaFon Dashboard : 10,000,000+ ZenUI Dialer & Contacts : 10,000,000+ Cricket Visual Voicemail : 10,000,000+ Opera Browser: Fast and Secure : 100,000,000+ TracFone My Account : 1,000,000+ Firefox Focus: The privacy browser : 1,000,000+ Google Voice : 10,000,000+ Chrome Dev : 5,000,000+ Xperia Link™ : 10,000,000+ TouchPal Keyboard - Fun Emoji & Android Keyboard : 10,000,000+ Who : 100,000,000+ Skype Lite - Free Video Call & Chat : 5,000,000+ WeChat : 100,000,000+ UC Browser Mini -Tiny Fast Private & Secure : 100,000,000+ WhatsApp Business : 10,000,000+ My magenta : 1,000,000+ Android Messages : 100,000,000+ Telegram : 100,000,000+ Google Duo - High Quality Video Calls : 500,000,000+ Puffin Web Browser : 10,000,000+ Seznam.cz : 1,000,000+ Antillean Gold Telegram (original version) : 100,000+ AT&T Visual Voicemail : 10,000,000+ GMX Mail : 10,000,000+ Omlet Chat : 10,000,000+ UC Browser - Fast Download Private & Secure : 500,000,000+ My Vodacom SA : 5,000,000+ Microsoft Edge : 5,000,000+ Hangouts Dialer - Call Phones : 10,000,000+ Talkatone: Free Texts, Calls & Phone Number : 10,000,000+ Calls & Text by Mo+ : 5,000,000+ free video calls and chat : 50,000,000+ Skype - free IM & video calls : 1,000,000,000+ GO SMS Pro - Messenger, Free Themes, Emoji : 100,000,000+ Messaging+ SMS, MMS Free : 1,000,000+ chomp SMS : 
10,000,000+ Glide - Video Chat Messenger : 10,000,000+ Text SMS : 10,000,000+ Google Allo : 10,000,000+ Talkray - Free Calls & Texts : 10,000,000+ LINE: Free Calls & Messages : 500,000,000+ GroupMe : 10,000,000+ mysms SMS Text Messaging Sync : 1,000,000+ BBM - Free Calls & Messages : 100,000,000+ KakaoTalk: Free Calls & Text : 100,000,000+ 2ndLine - Second Phone Number : 1,000,000+ CM Browser - Ad Blocker , Fast Download , Privacy : 50,000,000+ Ninesky Browser : 1,000,000+ Dolphin Browser - Fast, Private & Adblock🐬 : 50,000,000+ Ghostery Privacy Browser : 1,000,000+ InBrowser - Incognito Browsing : 1,000,000+ Web Browser for Android : 1,000,000+ DU Browser—Browse fast & fun : 10,000,000+ Lightning Web Browser : 500,000+ Web Browser : 500,000+ Contacts+ : 10,000,000+ ExDialer - Dialer & Contacts : 10,000,000+ PHONE for Google Voice & GTalk : 1,000,000+ Safest Call Blocker : 1,000,000+ Full Screen Caller ID : 5,000,000+ Hiya - Caller ID & Block : 10,000,000+ Mr. Number-Block calls & spam : 10,000,000+ Should I Answer? 
: 1,000,000+ RocketDial Dialer & Contacts : 1,000,000+ CallApp: Caller ID, Blocker & Phone Call Recorder : 10,000,000+ Whoscall - Caller ID & Block : 10,000,000+ CIA - Caller ID & Call Blocker : 5,000,000+ Calls Blacklist - Call Blocker : 10,000,000+ Call Control - Call Blocker : 5,000,000+ True Contact - Real Caller ID : 1,000,000+ Video Caller Id : 1,000,000+ Sync.ME – Caller ID & Block : 5,000,000+ Burner - Free Phone Number : 1,000,000+ Truecaller: Caller ID, SMS spam blocking & Dialer : 100,000,000+ Caller ID + : 1,000,000+ K-9 Mail : 5,000,000+ myMail – Email for Hotmail, Gmail and Outlook Mail : 10,000,000+ Email TypeApp - Mail App : 1,000,000+ All Email Providers : 1,000,000+ Newton Mail - Email App for Gmail, Outlook, IMAP : 1,000,000+ GO Notifier : 10,000,000+ Mail.Ru - Email App : 50,000,000+ Mail1Click - Secure Mail : 10,000+ Daum Mail - Next Mail : 5,000,000+ mail.com mail : 1,000,000+ SolMail - All-in-One email app : 500,000+ Vonage Mobile® Call Video Text : 1,000,000+ JusTalk - Free Video Calls and Fun Video Chat : 5,000,000+ Azar : 50,000,000+ LokLok: Draw on a Lock Screen : 500,000+ Discord - Chat for Gamers : 10,000,000+ AntennaPict β : 1,000,000+ Kik : 100,000,000+ K-@ Mail - Email App : 100,000+ K-9 Material (unofficial) : 5,000+ M star Dialer : 100,000+ Free WiFi Connect : 10,000,000+ m:go BiH : 10,000+ N-Com Wizard : 50,000+ Opera Mini browser beta : 10,000,000+ Psiphon Pro - The Internet Freedom VPN : 10,000,000+ ICQ — Video Calls & Chat Messenger : 10,000,000+ AT&T Messages for Tablet : 1,000,000+ T-Mobile DIGITS : 100,000+ Portable Wi-Fi hotspot : 10,000,000+ AT&T Call Protect : 5,000,000+ U - Webinars, Meetings & Messenger : 500,000+ /u/app : 10,000+ [verify-U] VideoIdent : 10,000+ WhatsCall Free Global Phone Call App & Cheap Calls : 10,000,000+ X Browser : 50,000+ Free Adblocker Browser - Adblock & Popup Blocker : 10,000,000+ Adblock Browser for Android : 10,000,000+ Adblock Plus for Samsung Internet - Browse safe. 
: 1,000,000+ Ad Blocker Turbo - Adblocker Browser : 10,000+ Brave Browser: Fast AdBlocker : 5,000,000+ AG Contacts, Lite edition : 5,000+ Oklahoma Ag Co-op Council : 10+ Bee'ah Employee App : 100+ tournaments and more.aj.2 : 100+ Aj.Petra : 100+ AK Phone : 5,000+ PlacarTv Futebol Ao Vivo : 100,000+ WiFi Access Point (hotspot) : 100,000+ Access Point Names : 10,000+ ClanHQ : 10,000+ Ear Agent: Super Hearing : 5,000,000+ AU Call Blocker - Block Unwanted Calls Texts 2018 : 1,000+ Baby Monitor AV : 100,000+ AV Phone : 1,000+ AW - free video calls and chat : 1,000,000+ Katalogen.ax : 100+ AZ Browser. Private & Download : 100,000+ BA SALES : 1+ BD Data Plan (3G & 4G) : 500,000+ BD Internet Packages (Updated) : 50,000+ BD Dialer : 10,000+ BD Live Call : 5,000+ Best Browser BD social networking : 10+ Traffic signs BD : 500+ BF Browser by Betfilter - Stop Gambling Today! : 10,000+ My BF App : 50,000+ BH Mail : 1,000+ Zalo – Video Call : 50,000,000+ BJ - Confidential : 10+ BK Chat : 1,000+ Of the wall Arapaho bk : 5+ AC-BL : 50+ DMR BrandMeister Tool : 10,000+ BBMoji - Your personalized BBM Stickers : 1,000,000+ BN MALLORCA Radio : 1,000+ BQ Partners : 1,000+ BS-Mobile : 50+ ATC Unico BS : 500+ BT One Voice mobile access : 5,000+ BT Messenger : 50,000+ BT One Phone Mobile App : 10,000+ SW-100.tch by Callstel : 1,000,000+ BT MeetMe with Dolby Voice : 100,000+ Bluetooth Auto Connect : 5,000,000+ AudioBT: BT audio GPS/SMS/Text : 50,000+ BV : 100+ Feel Performer : 10,000+ Tiny Call Confirm : 1,000,000+ CB Radio Chat - for friends! 
: 1,000,000+ CB On Mobile : 100,000+ Virtual Walkie Talkie : 1,000,000+ Channel 19 : 100,000+ Cb browser : 50+ CF Chat: Connecting Friends : 100+ retteMi.ch : 5,000+ CJ Browser - Fast & Private : 100+ CJ DVD Rentals : 100+ CK Call NEW : 10+ CM Transfer - Share any files with friends nearby : 5,000,000+ mail.co.uk Mail : 5,000+ ClanPlay: Community and Tools for Gamers : 1,000,000+ CQ-Mobile : 1,000+ CQ-Alert : 500+ QRZ Assistant : 100,000+ Pocket Prefix Plus : 10,000+ Ham Radio Prefixes : 10,000+ CS Customizer : 1,000+ CS Browser | #1 & BEST BROWSER : 1,000+ CS Browser Beta : 5,000+ My Vodafone (GR) : 1,000,000+ IZ2UUF Morse Koch CW : 50,000+ C W Browser : 100+ CW Bluetooth SPP : 100+ CW BLE Peripheral Simulator : 500+ Morse Code Reader : 100,000+ Learn Morse Code - G0HYN Learn Morse : 5,000+ Ring : 10,000+ Hyundai CX Conference : 50+ Cy Messenger : 100+ Amadeus GR & CY : 100+ Hlášenírozhlasu.cz : 10+ SMS Sender - sluzba.cz : 1,000+ WEB.DE Mail : 10,000,000+ Your Freedom VPN Client : 5,000,000+ Rádio Sol Nascente DF : 500+ DG Card : 100+ DK Browser : 10+ cluster.dk : 1,000+ DK TEL Dialer : 50+ DM for WhatsApp : 5,000+ DM Talk New : 5,000+ DM - The Offical Messaging App : 10+ DM Tracker : 1,000+ Call Blocker & Blacklist : 1,000+ ReadyOp DT : 1,000+ Caller ID & Call Block - DU Caller : 5,000,000+ BlueDV AMBE : 1,000+ DW Contacts & Phone & Dialer : 1,000,000+ Deaf World DW : 10,000+ Ham DX Cluster & Spots Finder : 5,000+ Mircules DX Cluster Lite : 5,000+ 3G DZ Configuration : 50,000+ chat dz : 100+ love sms good morning : 5,000+ Goodbox - Mega App : 100,000+ Call Blocker - Blacklist, SMS Blocker : 1,000,000+ [EF]ShoutBox : 100+ Eg Call : 10,000+ ei : 10+ EJ messenger : 10+ Ek IRA : 10+ Orfox: Tor Browser for Android : 10,000,000+ EO Mumbai : 10+ EP RSS Reader : 100+ Voxer Walkie Talkie Messenger : 10,000,000+ ES-1 : 500+ EU Council : 1,000+ Council Voting Calculator : 5,000+ Have your say on Europe : 500+ Programi podrške EU : 100+ Inbox.eu : 10,000+ Everbridge : 
100,000+ Best Auto Call Recorder Free : 500+ EZ Wifi Notification : 10,000+ Test Server SMS FA : 5+ Lite for Facebook Messenger : 1,000,000+ FC Browser - Focus Privacy Browser : 1,000+ EHiN-FH conferenceapp : 100+ Carpooling FH Hagenberg : 100+ Wi-Fi Auto-connect : 1,000,000+ Talkie - Wi-Fi Calling, Chats, File Sharing : 500,000+ WeFi - Free Fast WiFi Connect & Find Wi-Fi Map : 1,000,000+ Sat-Fi : 5,000+ Portable Wi-Fi hotspot Free : 100,000+ TownWiFi | Wi-Fi Everywhere : 500,000+ Jazz Wi-Fi : 10,000+ Sat-Fi Voice : 1,000+ Free Wi-fi HotspoT : 50,000+ FN Web Radio : 10+ FNH Payment Info : 10+ MARKET FO : 100+ FO OP St-Nazaire : 100+ FO SODEXO : 100+ FO RCBT : 100+ FO Interim : 100+ FO PSA Sept-Fons : 100+ FO AIRBUS TLSE : 1,000+ FO STELIA Méaulte : 100+ FO AIRBUS Nantes : 100+ FP Connect : 100+ FreedomPop Messaging Phone/SIM : 500,000+ FP Live : 10+ HipChat - beta version : 50,000+
print("Total number of apps under the 'Communication' category: ", comm_apps_tl)
Total number of apps under the 'Communication' category: 287
If we removed all the communication apps that have 100 million installs or more, the average would drop roughly tenfold:
under_100m = []
for app in android_final:
    n_installs = app[5]
    n_installs = n_installs.replace(',', '')
    n_installs = n_installs.replace('+', '')
    if (app[1] == 'COMMUNICATION') and (float(n_installs) < 100000000):
        under_100m.append(float(n_installs))
print(f'{(sum(under_100m) / len(under_100m)):,.2f}', '\n')
print("Number of 'Communication' apps with under 100 million installs: ", len(under_100m))
3,603,485.39 Number of 'Communication' apps with under 100 million installs: 260
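The pull that a few very large apps exert on the average can also be gauged by comparing the mean with the median, which is far less sensitive to outliers. The sketch below uses made-up install counts, not values from the data set, purely to illustrate the effect.

```python
from statistics import mean, median

# Made-up install counts: several small apps and one giant.
toy_installs = [1_000.0, 5_000.0, 10_000.0, 50_000.0, 1_000_000_000.0]

# The single giant dominates the mean but barely affects the median.
print(f'mean:   {mean(toy_installs):,.2f}')
print(f'median: {median(toy_installs):,.2f}')
```

With the real data, the same two calls could be applied to a list of parsed install counts for any category.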
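We see the same pattern for the video players category, whose average of 24,852,732 installs is pulled up by a few giants such as YouTube, Google Play Movies & TV, and MX Player: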
for app in android_final:
    if app[1] == 'VIDEO_PLAYERS':
        print(app[0], ':', app[5])
YouTube : 1,000,000,000+ All Video Downloader 2018 : 1,000,000+ Video Downloader : 10,000,000+ HD Video Player : 1,000,000+ Iqiyi (for tablet) : 1,000,000+ Motorola FM Radio : 100,000,000+ Video Player All Format : 10,000,000+ Motorola Gallery : 100,000,000+ Free TV series : 100,000+ Video Player All Format for Android : 500,000+ VLC for Android : 100,000,000+ Code : 10,000,000+ Vote for : 50,000,000+ XX HD Video downloader-Free Video Downloader : 1,000,000+ OBJECTIVE : 1,000,000+ Music - Mp3 Player : 10,000,000+ HD Movie Video Player : 1,000,000+ YouCut - Video Editor & Video Maker, No Watermark : 5,000,000+ Video Editor,Crop Video,Movie Video,Music,Effects : 1,000,000+ YouTube Studio : 10,000,000+ video player for android : 10,000,000+ Vigo Video : 50,000,000+ Google Play Movies & TV : 1,000,000,000+ HTC Service - DLNA : 10,000,000+ VPlayer : 1,000,000+ MiniMovie - Free Video and Slideshow Editor : 50,000,000+ Samsung Video Library : 50,000,000+ OnePlus Gallery : 1,000,000+ LIKE – Magic Video Maker & Community : 50,000,000+ HTC Service—Video Player : 5,000,000+ Play Tube : 1,000,000+ Droid Zap by Motorola : 5,000,000+ video player : 1,000,000+ G Guide Program Guide (SOFTBANK EMOBILE WILLCOM version) : 1,000,000+ Video.Guru - Video Maker : 1,000,000+ HTC Gallery : 10,000,000+ PowerDirector Video Editor App: 4K, Slow Mo & More : 10,000,000+ Cartoon Network App : 10,000,000+ MX Player : 500,000,000+ Video Status : 1,000,000+ Video Wallpaper Show : 500+ SVT Play : 1,000,000+ BluTV : 1,000,000+ Tencent Video - Supporting the whole network : 1,000,000+ Casper Ssinema : 10,000+ amazer - Global Kpop Video Community : 100,000+ Omlet Arcade - Stream, Meet, Play : 10,000,000+ VUE: video editor & camcorder : 1,000,000+ Magisto Video Editor & Maker : 10,000,000+ Dubsmash : 100,000,000+ DU Recorder – Screen Recorder, Video Editor, Live : 50,000,000+ KineMaster – Pro Video Editor : 50,000,000+ Mobizen Screen Recorder for SAMSUNG : 10,000,000+ Mobizen Screen Recorder for LG - 
Record, Capture : 1,000,000+ M-Sight Pro : 5,000+ Sketch 'n' go : 100,000+ Q-See Plus : 5,000+ Ustream : 10,000,000+ VMate : 50,000,000+ All Video Downloader : 10,000,000+ VidPlay : 1,000,000+ HD Video Downloader : 2018 Best video mate : 50,000,000+ VivaVideo - Video Editor & Photo Movie : 100,000,000+ VideoShow-Video Editor, Video Maker, Beauty Camera : 100,000,000+ W Box VMS : 10,000+ W Box VMS HD : 5,000+ AB Repeat Player : 100,000+ A-B repeater : 5,000+ Ez Screen Recorder (no ad) : 100,000+ Adobe Premiere Clip : 5,000,000+ FilmoraGo - Free Video Editor : 10,000,000+ ActionDirector Video Editor - Edit Videos Fast : 5,000,000+ AJ Player : 100+ AK Lodi Films : 100+ WiFi Baby Monitor - NannyCam : 5,000,000+ Ringdroid : 50,000,000+ Multiple Videos at Same Time : 1,000,000+ AV-IPTV : 1,000+ HD Video Player (wmv,avi,mp4,flv,av,mpg,mkv)2017 : 10,000+ HD Video Player - Video & MP3 Player | AV Player | : 5,000+ EML UPnP-AV Control Point : 10,000+ AW Screen Recorder No Root : 100,000+ AX Player -Nougat Video Player : 1,000,000+ AX Video Player : 50,000+ Ay : 5,000+ Ay Sabz Gunbad Waly : 1,000+ iMediaShare – Photos & Music : 10,000,000+ AZ Screen Recorder - No Root : 10,000,000+ Movie Downloader Torrent : Az Torrent : 1,000+ A-Z Screen Recorder - : 500+ BC iptv player : 1,000+ Bc Vod : 100+ Funny videos for whatsapp : 1,000,000+ BG video - floating video - background video : 5,000+ BG MUSIC PLAYER - MUSIC PLAYER : 100+ bgtime.tv : 50,000+ YourTube Video Views BG : 500+ Music for Youtube - Tube Music BG, Red+ : 1,000+ BGCN TV : 100,000+ AfreecaTV : 10,000,000+ BK News Channel : 10,000+ BR Video Player : 5,000+ BR Series : 50,000+ CINE BR : 1,000+ iPlayIT for YouTube VR Player : 1,000,000+ BSPlayer FREE : 10,000,000+ BSPlayer ARMv7 VFP CPU support : 1,000,000+ BS player remote : 10,000+ BitTorrent®- Torrent Downloads : 10,000,000+ Bx-WiFi-GI : 100+ BZ Langenthaler Tagblatt : 1,000+ Nero AirBurn : 100,000+ CI Stream : 10+ CJ Camcorder : 500+ CJ VLC HD Remote (+ Stream) : 
500,000+ ACTIVEON CX & CX GOLD : 50,000+ CX Monthly Tech News : 500+ DG UPnP Player Free : 10,000+ DG Screen Recorder : 500+ DG Video Editor : 10,000+ Video Downloader - for Instagram Repost App : 10,000,000+ Quik – Free Video Editor for photos, clips, music : 10,000,000+ FrostWire: Torrent Downloader & Music Player : 10,000,000+ Inst Download - Video & Photo : 10,000,000+ Vuze Torrent Downloader : 1,000,000+ AndStream - Streaming Download : 1,000,000+ DR TV : 500,000+ DS photo : 1,000,000+ DS video : 1,000,000+ DU Privacy-hide apps、sms、file : 1,000,000+ iSmart DV : 1,000,000+ dv Prompter : 50,000+ DV Lottery Photo : 5,000+ MelifeCam-M : 10,000+ GoPlus Cam : 500,000+ GoAction : 100,000+ 4K VIDEO PLAYER ULTRA HD : 5,000+ Downvids Helper - One touch DW : 10,000+ DZ Popup Video Player : 5,000+ EC MANAGER : 100+ EF Sidekick : 5,000+ ek tuhi : 10,000+ Naruto Shippuden - Watch Free! : 10,000,000+ ES Audio Player ( Shortcut ) : 100,000+ Furrion ES Control : 5,000+ ES-IPTV : 50,000+ EZCast – Cast Media to TV : 1,000,000+ EZ Web Video Cast | Chromecast : 100,000+ EZ-SEE : 10,000+ EZ TV Player : 10,000+ EZ Usenet for Easynews® : 10,000+ EZ game screen recorder with audio 1080P : 1,000+ Videos downloader for Facebook:fast fb video saver : 5,000+ Video Downloader for FB : Video Download with Link : 100,000+ HD VideoDownlaoder For Fb : XXVideo Downloader : 10,000+ HD Video Download for Facebook : 1,000,000+ Art of F J Taylor : 10+ List iptv FR : 100+
The main concern is that these app genres might seem more popular than they really are.
Concerns aside, and notwithstanding the fact that these genres are dominated by a few giants that are hard to compete against, there might still be ways to break into them by thinking outside the box.
Take YouTube, for example. Challenging its reign is Odysee, a video site launched in December 2020 and created to provide an alternative, as "the internet has become 'very corporate' with a small number of companies controlling the flow of information". The site was created by the team behind the LBRY (pronounced "library") blockchain protocol.
The game genre is quite popular, albeit a bit saturated, so we'd like to come up with a different app recommendation if possible.
By the same logic, a genre that is not overcrowded (i.e. has a lower weighting) but enjoys a relatively high number of reviews or installs could be fertile ground for a new app.
Let's compare the bottom half of the genres/categories in the sorted weighting frequency table with the top half of the genres in the reviews/installs table.
for entry in ios_gen_ft_s[-12:]:
    print(entry[0], ":", round(entry[1], 3))
print()
for entry in ios_anr[:12]:
    print(entry[0], ":", f'{entry[1]:,.2f}')
Lifestyle : 1.561 News : 1.342 Travel : 1.217 Finance : 1.092 Weather : 0.874 Food & Drink : 0.811 Reference : 0.562 Business : 0.531 Book : 0.406 Navigation : 0.187 Medical : 0.187 Catalogs : 0.125

Navigation : 86,090.33 Reference : 74,942.11 Social Networking : 72,916.55 Music : 57,326.53 Weather : 52,279.89 Book : 42,816.85 Food & Drink : 33,333.92 Finance : 32,367.03 Travel : 28,964.05 Photo & Video : 28,441.54 Shopping : 26,919.69 Health & Fitness : 23,298.02
For the App Store, 'Travel', 'Finance', 'Weather', 'Food & Drink', 'Reference', 'Book' and 'Navigation' are the overlapping genres.
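The overlap can also be computed rather than read off by eye. The sketch below copies the genre names from the two printed lists above into plain Python lists; the variable names are chosen for illustration only.

```python
# Genre names copied from the two printed lists above.
least_crowded = ['Lifestyle', 'News', 'Travel', 'Finance', 'Weather',
                 'Food & Drink', 'Reference', 'Business', 'Book',
                 'Navigation', 'Medical', 'Catalogs']
most_reviewed = ['Navigation', 'Reference', 'Social Networking', 'Music',
                 'Weather', 'Book', 'Food & Drink', 'Finance', 'Travel',
                 'Photo & Video', 'Shopping', 'Health & Fitness']

# Keep the order of the first list so the result is easy to compare.
most_reviewed_set = set(most_reviewed)
overlap = [genre for genre in least_crowded if genre in most_reviewed_set]
print(overlap)
```

In the notebook itself, the two lists could be built directly as `[entry[0] for entry in ios_gen_ft_s[-12:]]` and `[entry[0] for entry in ios_anr[:12]]`, and the same approach would apply to the Google Play tables.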
Other than 'Reference' (dominated by the Bible and dictionaries), 'Navigation' (dominated by Waze and Google Maps) and 'Weather', the remaining genres could be "blue oceans" where interesting and creative content might draw more users.
for entry in android_cat_ft_s[-16:]:
    print(entry[0], ":", round(entry[1], 3))
print()
for entry in android_ani[0:16]:
    print(entry[0], ":", f'{entry[1]:,.2f}')
BOOKS_AND_REFERENCE : 2.144 DATING : 1.862 VIDEO_PLAYERS : 1.783 MAPS_AND_NAVIGATION : 1.399 EDUCATION : 1.286 FOOD_AND_DRINK : 1.241 ENTERTAINMENT : 1.128 LIBRARIES_AND_DEMO : 0.937 AUTO_AND_VEHICLES : 0.925 HOUSE_AND_HOME : 0.835 WEATHER : 0.801 EVENTS : 0.711 ART_AND_DESIGN : 0.677 PARENTING : 0.654 COMICS : 0.621 BEAUTY : 0.598

COMMUNICATION : 38,456,119.17 VIDEO_PLAYERS : 24,852,732.41 SOCIAL : 23,253,652.13 ENTERTAINMENT : 21,134,600.00 PHOTOGRAPHY : 17,805,627.64 PRODUCTIVITY : 16,787,331.34 GAME : 15,837,565.09 TRAVEL_AND_LOCAL : 13,984,077.71 TOOLS : 10,695,245.29 NEWS_AND_MAGAZINES : 9,549,178.47 BOOKS_AND_REFERENCE : 8,767,811.89 SHOPPING : 7,036,877.31 PERSONALIZATION : 5,201,482.61 WEATHER : 5,074,486.20 HEALTH_AND_FITNESS : 4,188,821.99 MAPS_AND_NAVIGATION : 4,056,941.77
On the other hand, the overlapping categories for Google Play are 'BOOKS_AND_REFERENCE', 'VIDEO_PLAYERS', 'ENTERTAINMENT', 'MAPS_AND_NAVIGATION' and 'WEATHER'.
Putting 'MAPS_AND_NAVIGATION' and 'WEATHER' aside for the time being, 'BOOKS_AND_REFERENCE' and 'ENTERTAINMENT' seem to coincide with 'Book', 'Travel' and 'Food & Drink' on the App Store (if we count the latter two as forms of entertainment).
It would be interesting to explore these genres in more depth, as they have the potential to be profitable on both the App Store and Google Play.
Quantitatively, based on the data analyses above, the genre to explore first would most likely be 'books'. Qualitatively, we may have to dig deeper, perhaps into the reviews themselves, to discover other hidden gems of insight.
In this project, we analyzed data about the App Store and Google Play mobile apps with the goal of recommending an app profile that can be profitable for both markets.
We concluded that the 'book' genre is worth exploring further: taking a popular (perhaps more recent) book and turning it into an app could be profitable for both the Google Play and the App Store markets. The markets are already full of libraries, so we need to add some special features besides the raw version of the book. This might include daily quotes from the book, an audio version of the book, quizzes on the book, a forum where people can discuss the book, etc.