As a company that builds free mobile apps for both the iOS App Store and the Google Play Store, our revenue comes from users viewing in-app ads. Hence, the more users an app has, the greater the revenue.
What follows is an analysis of a number of apps from both platforms. It identifies which type (category) of app attracts the most users, and based on that conclusion the company will decide which category its next app should belong to.
You can view and download the Google apps data set from here.
You can view and download the iOS apps data set from here.
Please note that the data was collected in 2017/2018.
# Open the iOS apps file
from csv import reader
opened_file = open('AppleStore.csv', encoding='utf8')
read_file = reader(opened_file)
apps_ios = list(read_file)
opened_file.close()
ios_header = apps_ios[0]
apps_ios = apps_ios[1:]
# Open the Google apps file
Gopened_file = open('googleplaystore.csv', encoding='utf8')
Gread_file = reader(Gopened_file)
apps_google = list(Gread_file)
Gopened_file.close()
google_header = apps_google[0]
apps_google = apps_google[1:]
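The two loading steps follow the same pattern, so they could be wrapped in one helper that also closes the file automatically. This is only a minimal sketch: `load_csv` is a hypothetical name that does not appear in the original notebook, and it assumes the files are UTF-8 encoded.

```python
from csv import reader

def load_csv(path, encoding='utf8'):
    """Return (header, rows) for a CSV file. Hypothetical helper, not from the notebook."""
    # newline='' is the setting the csv module documentation recommends for file objects
    with open(path, encoding=encoding, newline='') as f:
        rows = list(reader(f))
    return rows[0], rows[1:]

# Usage (assumes both CSV files sit next to the notebook):
# ios_header, apps_ios = load_csv('AppleStore.csv')
# google_header, apps_google = load_csv('googleplaystore.csv')
```

The `with` block guarantees the file is closed even if reading fails part-way.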
To make our data easier to explore, we will build a function that prints the desired row(s) based on its inputs:
def explore_data(dataset, start, end, printrnc=False):
    data_slice = dataset[start:end]
    for row in data_slice:
        print(row)
        print('\n')
    if printrnc:
        print("number of rows = ", len(dataset))
        print('number of columns =', len(dataset[0]))
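The explore_data() function takes a data set (a list of rows), a start index and an end index, and prints the rows in that slice. When printrnc is True, it also prints the number of rows and columns of the whole data set.
Please note that the dataset shouldn't include a header row; otherwise the function will report one more row than the actual number of entries.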
print(ios_header,'\n')
explore_data(apps_ios,5,10,True)
['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']

['429047995', 'Pinterest', '74778624', 'USD', '0.0', '1061624', '1814', '4.5', '4.0', '6.26', '12+', 'Social Networking', '37', '5', '27', '1']
['282935706', 'Bible', '92774400', 'USD', '0.0', '985920', '5320', '4.5', '5.0', '7.5.1', '4+', 'Reference', '37', '5', '45', '1']
['553834731', 'Candy Crush Saga', '222846976', 'USD', '0.0', '961794', '2453', '4.5', '4.5', '1.101.0', '4+', 'Games', '43', '5', '24', '1']
['324684580', 'Spotify Music', '132510720', 'USD', '0.0', '878563', '8253', '4.5', '4.5', '8.4.3', '12+', 'Music', '37', '5', '18', '1']
['343200656', 'Angry Birds', '175966208', 'USD', '0.0', '824451', '107', '4.5', '3.0', '7.4.0', '4+', 'Games', '38', '0', '10', '1']
number of rows = 7197
number of columns = 16
From the header we can see that the fields relevant to our analysis are: track_name, price, rating_count_tot, user_rating and prime_genre.
print(google_header, '\n')
explore_data(apps_google,5,10,True)
['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']

['Paper flowers instructions', 'ART_AND_DESIGN', '4.4', '167', '5.6M', '50,000+', 'Free', '0', 'Everyone', 'Art & Design', 'March 26, 2017', '1.0', '2.3 and up']
['Smoke Effect Photo Maker - Smoke Editor', 'ART_AND_DESIGN', '3.8', '178', '19M', '50,000+', 'Free', '0', 'Everyone', 'Art & Design', 'April 26, 2018', '1.1', '4.0.3 and up']
['Infinite Painter', 'ART_AND_DESIGN', '4.1', '36815', '29M', '1,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'June 14, 2018', '6.1.61.1', '4.2 and up']
['Garden Coloring Book', 'ART_AND_DESIGN', '4.4', '13791', '33M', '1,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'September 20, 2017', '2.9.2', '3.0 and up']
['Kids Paint Free - Drawing Fun', 'ART_AND_DESIGN', '4.7', '121', '3.1M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design;Creativity', 'July 3, 2018', '2.8', '4.0.3 and up']
number of rows = 10841
number of columns = 13
We can see from the headers that the fields we are interested in are: App, Category, Rating, Reviews, Installs and Price.
The first and, in my opinion, the most important step when analyzing data is to clean the data set at hand.
Our data cleaning procedure has 4 steps: removing incorrect rows, removing duplicate rows, removing non-English apps, and removing non-free apps.
The Google Play Store data set has a discussion panel, which you can find here. It reports that the entry at row 10472 is missing its 'Category' field, which shifts the rest of the columns to the left. Using this row would lead to wrong calculations, so we have to delete it.
Detecting nulls is an important part of any data cleaning routine. The next piece of code looks for rows with missing fields in both data sets and confirms that row 10472 of the Google apps data set does have a missing field, as mentioned in the discussion panel.
def find_nulls(dataset):
    if dataset is apps_google:
        header = google_header
    elif dataset is apps_ios:
        header = ios_header
    else:
        print('Please enter function parameter only as "apps_google" OR "apps_ios"')
        return  # no header to compare against, so stop here
    has_nulls = []
    # A row with fewer fields than the header is missing at least one value
    for index, row in enumerate(dataset):
        if len(row) < len(header):
            has_nulls.append(index)
            print('row ', index, ' has a null field')
    print(len(has_nulls), ' row(s) with null field(s) found')
print('Checking nulls in Google data set')
find_nulls(apps_google)
print('\n')
print('Checking nulls in IOS data set')
find_nulls(apps_ios)
Checking nulls in Google data set
row 10472 has a null field
1 row(s) with null field(s) found

Checking nulls in IOS data set
0 row(s) with null field(s) found
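The same length check can be written without referring to the global data sets. The sketch below is an illustration only: `find_short_rows` is a hypothetical name and the sample rows are made up.

```python
def find_short_rows(dataset, header):
    """Return the indexes of rows that have fewer fields than the header."""
    return [i for i, row in enumerate(dataset) if len(row) < len(header)]

# Made-up sample: the second row is missing one field
sample_header = ['App', 'Category', 'Rating']
sample_rows = [
    ['App A', 'TOOLS', '4.2'],
    ['App B', '1.9'],
    ['App C', 'GAME', '4.5'],
]
print(find_short_rows(sample_rows, sample_header))  # [1]
```

Passing the header explicitly means the function works for any list of rows, and `enumerate` reports the position of each short row even when two rows have identical content.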
The output above confirms that row 10472 in the Google data set does have a null field. Let's print it next to the header and a correct row so we can see for ourselves:
print(google_header,'\n')
explore_data(apps_google,10472,10474) #10472 incorrect, 10473 correct
['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']

['Life Made WI-Fi Touchscreen Photo Frame', '1.9', '19', '3.0M', '1,000+', 'Free', '0', 'Everyone', '', 'February 11, 2018', '1.0.19', '4.0 and up']
['osmino Wi-Fi: free WiFi', 'TOOLS', '4.2', '134203', '4.1M', '10,000,000+', 'Free', '0', 'Everyone', 'Tools', 'August 7, 2018', '6.06.14', '4.4 and up']
Notice that the 'Life Made WI-Fi Touchscreen Photo Frame' app has its Category field missing (null) and the rest of its fields are shifted to the left.
As a step in our data cleaning process, let's delete this wrong entry.
Please note that the following code should be executed only once; running it again would delete a healthy row.
del apps_google[10472] #run only once
The second step in our data cleaning process is removing duplicate rows. The next piece of code runs through the data sets and checks whether there are any duplicated entries.
def duplicate_rows(dataset):
    duplicate_apps = []
    unique_apps = []
    for row in dataset:
        if row[0] in unique_apps:
            duplicate_apps.append(row[0])
        else:
            unique_apps.append(row[0])
    print('Number of duplicated rows = ', len(duplicate_apps))
    print('Number of unique rows = ', len(unique_apps))
    return duplicate_apps
print('Checking Google data set')
duplicate_rows(apps_google)
print('\n')
print('Checking IOS data set')
duplicate_rows(apps_ios)
Checking Google data set
Number of duplicated rows = 1181
Number of unique rows = 9659

Checking IOS data set
Number of duplicated rows = 0
Number of unique rows = 7197
[]
The output above shows that the Google apps data set has 1181 duplicated rows, which need to be removed. Let's show an example of a duplicated app:
duplicate_apps = duplicate_rows(apps_google)
for i in duplicate_apps:
    if i == duplicate_apps[777]: #A random sample
        print(i)
Number of duplicated rows = 1181
Number of unique rows = 9659
Google News
Google News
Google News
The output above shows that the 'Google News' app has 4 entries in our data set (1 unique and 3 duplicated). Let's examine those 4 entries:
print(google_header,'\n')
for row in apps_google:
    if row[0] == 'Google News':
        print(row,'\n')
['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']

['Google News', 'NEWS_AND_MAGAZINES', '3.9', '877635', '13M', '1,000,000,000+', 'Free', '0', 'Teen', 'News & Magazines', 'August 1, 2018', '5.2.0', '4.4 and up']

['Google News', 'NEWS_AND_MAGAZINES', '3.9', '877635', '13M', '1,000,000,000+', 'Free', '0', 'Teen', 'News & Magazines', 'August 1, 2018', '5.2.0', '4.4 and up']

['Google News', 'NEWS_AND_MAGAZINES', '3.9', '877643', '13M', '1,000,000,000+', 'Free', '0', 'Teen', 'News & Magazines', 'August 1, 2018', '5.2.0', '4.4 and up']

['Google News', 'NEWS_AND_MAGAZINES', '3.9', '878065', '13M', '1,000,000,000+', 'Free', '0', 'Teen', 'News & Magazines', 'August 1, 2018', '5.2.0', '4.4 and up']
Examining the results, we notice that all the fields are identical except for the 'Reviews' column. When we delete duplicate rows we want to keep only the entry with the highest number of reviews, because it corresponds to the latest (most recently collected) entry of the app.
The next piece of code runs through the whole data set and, for each app, stores only the highest number of reviews found among its entries.
reviews_dict = {}
for row in apps_google:
    name = row[0]
    review_num = row[3]
    # Compare as integers: comparing the raw strings would rank '9' above '10'
    if name not in reviews_dict or int(review_num) > int(reviews_dict[name]):
        reviews_dict[name] = review_num
print(reviews_dict['Google News'])
print(len(reviews_dict))
878065
9659
We end up with a dictionary that has the app name as key and the highest number of reviews as value. We tested it on the Google News app; comparing the result with the 4 entries of the app confirms that only the highest reviews value is stored.
Also note that the dictionary's length is the same as the number of unique apps found earlier.
Next we are going to remove the duplicates. There are two points to keep in mind while doing this: we keep only the row whose number of reviews matches the maximum stored in reviews_dict, and, because several rows of the same app can share that maximum, we record the names already added so that each app is kept only once.
print('Google data set length before duplicates removal',len(apps_google))
apps_google_clean = []
apps_names = []
for k,v in reviews_dict.items(): #looping with keys and values
    for row in apps_google:
        name = row[0]
        reviews = row[3]
        # keep the first row that carries the maximum review count for this app
        if k == name and v == reviews and name not in apps_names:
            apps_google_clean.append(row)
            apps_names.append(name)
print('\nGoogle data set length after duplicates removal',len(apps_google_clean))
Google data set length before duplicates removal 10840

Google data set length after duplicates removal 9659
Note that we ended up with a Google data set length of 9659, as expected; this matches the number of unique apps found earlier.
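The two steps (finding the maximum review count, then filtering) can also be combined into a single pass over the rows. The following is a sketch under stated assumptions, not the notebook's method: `dedupe_by_max_reviews` is a hypothetical name, the sample rows are made up, and the result is ordered by each app's first appearance, which may differ from the row order produced above.

```python
def dedupe_by_max_reviews(rows, name_idx=0, reviews_idx=3):
    """Keep one row per app name: the first row with the highest review count."""
    best = {}  # app name -> best row seen so far
    for row in rows:
        name = row[name_idx]
        # Compare as integers; as strings, '9' would sort above '10'
        if name not in best or int(row[reviews_idx]) > int(best[name][reviews_idx]):
            best[name] = row
    return list(best.values())

# Made-up sample with two duplicated apps
sample = [
    ['Google News', 'NEWS', '3.9', '877635'],
    ['Google News', 'NEWS', '3.9', '878065'],
    ['Other App', 'TOOLS', '4.0', '9'],
    ['Other App', 'TOOLS', '4.0', '10'],
]
for row in dedupe_by_max_reviews(sample):
    print(row[0], row[3])
```

Because each row is visited once, this avoids the nested loop over the whole data set for every app name.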
As a company that develops free apps for an English-speaking audience, we need to remove apps that won't be useful to our target audience.
def english_check(a_string):
    for char in a_string:
        if ord(char) > 127:
            return False
    return True
The previous function takes a string and checks the code point of each character with ord(). If every character falls in the ASCII range (0-127), the name is treated as English; if any character has a code point greater than 127, the name is treated as non-English.
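To make the 127 cut-off concrete, the sketch below prints the code points of a few characters. The sample characters are chosen here for illustration and are not taken from the data sets.

```python
# Code points for a handful of characters; values above 127 are outside ASCII
samples = ['a', 'Z', '5', '+', 'é', '–', '™', '爱']
code_points = {char: ord(char) for char in samples}
for char, point in code_points.items():
    print(repr(char), point, 'ASCII' if point <= 127 else 'non-ASCII')

# str.isascii() (Python 3.7+) applies the same test to a whole string
print('Instagram'.isascii())
print('Docs To Go™ Free Office Suite'.isascii())
```

Letters, digits and common punctuation stay at or below 127, while accented letters, typographic dashes, trademark signs and CJK characters all land above it.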
Below are some examples from our data sets.
ex1 = apps_google[786][0]
print(ex1,'\n',english_check(ex1),'\n')
ex2 = apps_google[123][0]
print(ex2,'\n',english_check(ex2),'\n')
ex3 = apps_ios[813][1]
print(ex3,'\n',english_check(ex3),'\n')
ex4 = apps_ios[123][1]
print(ex4,'\n',english_check(ex4),'\n')
Coursera: Online courses
True

Manicure - nail design
True

爱奇艺PPS -《欢乐颂2》电视剧热播
False

Evernote - stay organized
True
However, notice that in the following example the app is English, yet a single non-ASCII character in its name makes our function flag it as non-English (which is not true):
ex5 = apps_ios[876][1]
print(ex5,'\n',english_check(ex5))
Artisto – Video and Photo Editor with Art Filters False
Thus, using the english_check() function in its current form would filter out a lot of valid English apps from our data sets.
Next we will modify our function so that it filters an app only if its name contains at least 4 non-ASCII characters, which minimizes the data loss.
def english_check(a_string):
    count = 0
    for char in a_string:
        if ord(char) > 127:
            count += 1
    if count > 3:
        return False
    else:
        return True
ex5 = apps_ios[876][1] #running the same example that returned False in the previous code
print(ex5,'\n',english_check(ex5))
Artisto – Video and Photo Editor with Art Filters True
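The cut-off of 3 is a judgment call, so it can be useful to expose it as a parameter and try different values. A minimal sketch, assuming a hypothetical function name `is_probably_english` and sample names chosen for illustration:

```python
def is_probably_english(name, max_non_ascii=3):
    """True if `name` has at most `max_non_ascii` characters outside the ASCII range."""
    non_ascii = sum(1 for char in name if ord(char) > 127)
    return non_ascii <= max_non_ascii

print(is_probably_english('Instagram'))                                            # True
print(is_probably_english('Artisto – Video and Photo Editor with Art Filters'))    # True
print(is_probably_english('爱奇艺PPS -《欢乐颂2》电视剧热播'))                          # False
# With max_non_ascii=0 the check is as strict as the first version of the function
print(is_probably_english('Artisto – Video and Photo Editor with Art Filters', max_non_ascii=0))  # False
```

This remains a heuristic: a short name written entirely in three non-ASCII characters would still pass, and an English name with several emoji would still be rejected.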
Now let's use our modified function to update our data sets by deleting non-english apps
print('Google data set length was ',len(apps_google_clean))
print('Apple data set length was ',len(apps_ios),'\n')
apps_google_english = []
apps_ios_english = []
for row in apps_google_clean:
    name = row[0]
    if english_check(name):
        apps_google_english.append(row)
for row in apps_ios:
    name = row[1]
    if english_check(name):
        apps_ios_english.append(row)
print('Google data set length is now ',len(apps_google_english))
print('Apple data set length is now ',len(apps_ios_english))
Google data set length was 9659
Apple data set length was 7197

Google data set length is now 9500
Apple data set length is now 6100
Now to the last step in our data cleaning process before we start our analysis: we need to remove all the non-free apps, because our goal is to find out which type of free app attracts the most users.
def check_free(price):
    if price == 0.0:
        return True
    return False
Now let's test our new function against a free and a non-free app and see the result.
print(ios_header,'\n')
print(apps_ios[10],' >>> ',check_free(float(apps_ios[10][4])),'\n')
print(apps_ios[11],' >>> ',check_free(float(apps_ios[11][4])))
['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']

['512939461', 'Subway Surfers', '156038144', 'USD', '0.0', '706110', '97', '4.5', '4.0', '1.72.1', '9+', 'Games', '38', '5', '1', '1'] >>> True

['362949845', 'Fruit Ninja Classic', '104590336', 'USD', '1.99', '698516', '132', '4.5', '4.0', '2.3.9', '4+', 'Games', '38', '5', '13', '1'] >>> False
print('Google data set length was ',len(apps_google_english))
print('Apple data set length was ',len(apps_ios_english),'\n')
apps_google_final = []
apps_ios_final = []
for row in apps_google_english:
    try:
        price = float(row[7])
    except ValueError:
        # paid apps carry a leading currency sign, so drop the first character
        price = float(row[7][1:])
    if check_free(price):
        apps_google_final.append(row)
for row in apps_ios_english:
    price = float(row[4])
    if check_free(price):
        apps_ios_final.append(row)
print('Google data set length is now ',len(apps_google_final))
print('Apple data set length is now ',len(apps_ios_final))
Google data set length was 9500
Apple data set length was 6100

Google data set length is now 8758
Apple data set length is now 3169
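The price handling above can be isolated in a small helper so the conversion is explicit and easy to test. This is a sketch only: `parse_price` and `is_free` are hypothetical names, and it assumes that paid prices are written with a leading '$' while free apps are listed as '0' or '0.0'.

```python
def parse_price(text):
    """Convert a price string such as '0', '0.0' or '$4.99' to a float."""
    # Assumption: the only non-numeric decoration is a leading '$'
    return float(text.strip().lstrip('$'))

def is_free(text):
    """True when the parsed price is exactly zero."""
    return parse_price(text) == 0.0

for raw in ['0', '0.0', '$4.99', '1.99']:
    print(raw, '->', parse_price(raw), is_free(raw))
```

Any other format (for example a different currency symbol) would raise a ValueError, which is preferable to silently treating a malformed price as free.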
The data cleaning process is finally over.
From 10841 entries for the Google dataset we ended up with 8758.
From 7197 entries for the Apple dataset we ended up with 3169.
Having a clean data set that is filtered according to your needs helps ensure that your analysis gives correct results.
Now it's time to start analyzing our data. First, it's worth noting that our company follows a validation strategy for its apps in order to minimize risk. This strategy has 3 steps:
Now the question is: which app genre should the company build a minimal Android version for?
To answer this question properly there are 2 steps to consider: first, find the most common genres in each store by building a frequency table; second, find which genres attract the most users by computing the average number of installs per genre.
def freq_table(dataset, index):
    table = {}
    total = 0
    for row in dataset:
        key = row[index]
        total += 1
        if key in table:
            table[key] += 1
        else:
            table[key] = 1
    return table, total

genres_google_tot = freq_table(apps_google_final,1)[0]
genres_ios_tot = freq_table(apps_ios_final,11)[0]

def percent_table(dictionary, total):
    # converts the counts to percentages in place and returns the same dictionary
    for key in dictionary:
        dictionary[key] = (dictionary[key]/total)*100
    return dictionary

google_counts, google_total = freq_table(apps_google_final,1)
genres_google = percent_table(google_counts, google_total)
ios_counts, ios_total = freq_table(apps_ios_final,11)
genres_ios = percent_table(ios_counts, ios_total)
print('''Google PlayStore genres frequency Table:
----------------------------------------''')
print(genres_google,'\n')
print('''Apple AppStore genres frequency Table:
--------------------------------------''')
print(genres_ios)
Google PlayStore genres frequency Table:
----------------------------------------
{'ART_AND_DESIGN': 0.6508335236355333, 'FAMILY': 18.96551724137931, 'AUTO_AND_VEHICLES': 0.9248686914820735, 'BEAUTY': 0.6051609956611099, 'BOOKS_AND_REFERENCE': 2.146608814797899, 'BUSINESS': 4.647179721397579, 'COMICS': 0.5823247316738982, 'COMMUNICATION': 3.254167618177666, 'DATING': 1.861155514957753, 'EDUCATION': 1.187485727335008, 'ENTERTAINMENT': 0.959123087462891, 'EVENTS': 0.7193423155971682, 'FINANCE': 3.722311029915506, 'FOOD_AND_DRINK': 1.2331582553094314, 'HEALTH_AND_FITNESS': 3.0943137702671843, 'HOUSE_AND_HOME': 0.7878511075588034, 'TOOLS': 8.472253939255538, 'LIBRARIES_AND_DEMO': 0.9020324274948619, 'LIFESTYLE': 3.916419273806805, 'GAME': 9.625485270609728, 'VIDEO_PLAYERS': 1.8040648549897238, 'MEDICAL': 3.539620918017812, 'SOCIAL': 2.6490066225165565, 'SHOPPING': 2.2493720027403517, 'PHOTOGRAPHY': 2.980132450331126, 'SPORTS': 3.3340945421329073, 'TRAVEL_AND_LOCAL': 2.3407170586891985, 'PERSONALIZATION': 3.288422014158484, 'PRODUCTIVITY': 3.9392555377940166, 'PARENTING': 0.6394153916419274, 'WEATHER': 0.7878511075588034, 'NEWS_AND_MAGAZINES': 2.808860470427038, 'MAPS_AND_NAVIGATION': 1.3815939712263074}

Apple AppStore genres frequency Table:
--------------------------------------
{'Social Networking': 3.2817923635216157, 'Photo & Video': 5.0489113284947935, 'Games': 58.53581571473651, 'Music': 2.0511202272010096, 'Reference': 0.5364468286525718, 'Health & Fitness': 1.9880088355948247, 'Weather': 0.8520037866834964, 'Utilities': 2.398232881035027, 'Travel': 1.1360050489113285, 'Shopping': 2.5244556642473968, 'News': 1.3253392237298833, 'Navigation': 0.18933417481855475, 'Lifestyle': 1.5462290943515304, 'Entertainment': 7.82581255916693, 'Food & Drink': 0.8204480908804039, 'Sports': 2.1773430104133795, 'Book': 0.3786683496371095, 'Finance': 1.1044493531082362, 'Education': 3.72357210476491, 'Productivity': 1.7040075733669928, 'Business': 0.5364468286525718, 'Catalogs': 0.12622278321236985, 'Medical': 0.18933417481855475}
Now that we have our genres frequency table generated we need to sort it so we can find which genres are the most frequent.
def display_table(dictionary):
    values = []
    for k,v in dictionary.items():
        values.append((v,k))
    for v,k in sorted(values,reverse=True):
        print(k,':',round(v,2))
print('''Google PlayStore genres frequency Table:
----------------------------------------''')
display_table(genres_google)
print('''\n\nApple AppStore genres frequency Table:
--------------------------------------''')
display_table(genres_ios)
Google PlayStore genres frequency Table:
----------------------------------------
FAMILY : 18.97
GAME : 9.63
TOOLS : 8.47
BUSINESS : 4.65
PRODUCTIVITY : 3.94
LIFESTYLE : 3.92
FINANCE : 3.72
MEDICAL : 3.54
SPORTS : 3.33
PERSONALIZATION : 3.29
COMMUNICATION : 3.25
HEALTH_AND_FITNESS : 3.09
PHOTOGRAPHY : 2.98
NEWS_AND_MAGAZINES : 2.81
SOCIAL : 2.65
TRAVEL_AND_LOCAL : 2.34
SHOPPING : 2.25
BOOKS_AND_REFERENCE : 2.15
DATING : 1.86
VIDEO_PLAYERS : 1.8
MAPS_AND_NAVIGATION : 1.38
FOOD_AND_DRINK : 1.23
EDUCATION : 1.19
ENTERTAINMENT : 0.96
AUTO_AND_VEHICLES : 0.92
LIBRARIES_AND_DEMO : 0.9
WEATHER : 0.79
HOUSE_AND_HOME : 0.79
EVENTS : 0.72
ART_AND_DESIGN : 0.65
PARENTING : 0.64
BEAUTY : 0.61
COMICS : 0.58


Apple AppStore genres frequency Table:
--------------------------------------
Games : 58.54
Entertainment : 7.83
Photo & Video : 5.05
Education : 3.72
Social Networking : 3.28
Shopping : 2.52
Utilities : 2.4
Sports : 2.18
Music : 2.05
Health & Fitness : 1.99
Productivity : 1.7
Lifestyle : 1.55
News : 1.33
Travel : 1.14
Finance : 1.1
Weather : 0.85
Food & Drink : 0.82
Reference : 0.54
Business : 0.54
Book : 0.38
Navigation : 0.19
Medical : 0.19
Catalogs : 0.13
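The count, percentage and sort steps can also be expressed with `collections.Counter` from the standard library. This is an alternative sketch rather than the notebook's approach; `percent_table_sorted` is a hypothetical name and the sample rows are made up.

```python
from collections import Counter

def percent_table_sorted(rows, index):
    """Return [(value, percentage), ...] ordered from most to least frequent."""
    counts = Counter(row[index] for row in rows)
    total = sum(counts.values())
    return [(key, round(count / total * 100, 2)) for key, count in counts.most_common()]

# Made-up sample: three games and one tool
sample = [['a', 'GAME'], ['b', 'GAME'], ['c', 'TOOLS'], ['d', 'GAME']]
for genre, pct in percent_table_sorted(sample, 1):
    print(genre, ':', pct)
# GAME : 75.0
# TOOLS : 25.0
```

One difference to be aware of: `most_common()` keeps tied entries in first-seen order, whereas sorting `(value, key)` tuples in reverse breaks ties by key in descending order.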
By studying the frequency tables above we can conclude the following: in the Play Store, 'Family' (18.97%), 'Game' (9.63%) and 'Tools' (8.47%) are the most common categories, and the rest of the apps are spread fairly evenly across the remaining categories; in the App Store, 'Games' dominates with 58.54% of the apps, followed by 'Entertainment' (7.83%) and 'Photo & Video' (5.05%).
Note that the 'Family' category in the Play Store largely corresponds to games made for kids, as you can see in the following image.
Also keep in mind that these conclusions hold only for free English apps, not for the whole market.
However, being the "most frequent" category doesn't necessarily make it the "most used" category, since supply and demand may not match.
As mentioned before, we will now create a dictionary with the genres as keys and the average number of installs per app in each genre as values.
It's worth noting that the number of installs in our Google apps data set is not precise, as you can see in the next code result. For example, 100,000+ installs might be any number between 100,000 and 500,000, which is a very large range. However, our analysis doesn't need the exact numbers; we can work with the values at hand and still get acceptable results.
google_installs = freq_table(apps_google_final,5)[0]
display_table(google_installs)
1,000,000+ : 1380
100,000+ : 1009
10,000,000+ : 926
10,000+ : 894
1,000+ : 733
100+ : 609
5,000,000+ : 603
500,000+ : 487
50,000+ : 418
5,000+ : 393
10+ : 308
500+ : 281
50,000,000+ : 199
100,000,000+ : 186
50+ : 169
5+ : 69
1+ : 45
500,000,000+ : 24
1,000,000,000+ : 20
0+ : 4
0 : 1
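Since every install value is a bracket label such as '1,000,000+', it has to be turned into a number before it can be averaged. A minimal sketch of that conversion and of the per-genre average, using hypothetical helper names (`parse_installs`, `average_installs`) and made-up rows:

```python
from collections import defaultdict

def parse_installs(text):
    """'1,000,000+' -> 1000000 and '0' -> 0. The result is only a lower bound."""
    return int(text.replace(',', '').replace('+', ''))

def average_installs(rows, genre_idx, installs_idx):
    """Return {genre: average lower-bound installs per app}."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for row in rows:
        genre = row[genre_idx]
        totals[genre] += parse_installs(row[installs_idx])
        counts[genre] += 1
    return {genre: totals[genre] / counts[genre] for genre in totals}

# Made-up sample rows: [name, genre, installs]
sample = [['A', 'GAME', '1,000+'], ['B', 'GAME', '3,000+'], ['C', 'TOOLS', '0']]
print(average_installs(sample, 1, 2))  # {'GAME': 2000.0, 'TOOLS': 0.0}
```

Because each label is treated as its lower bound, the averages underestimate the true install counts; they are useful for comparing genres, not as absolute figures.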
used_google = {}
for row in apps_google_final:
    genre = row[1]
    if len(row[5]) > 1:
        installs = int(row[5][:-1].replace(',','')) #To avoid the '+' sign
    else:
        installs = int(row[5])
    if genre in used_google:
        used_google[genre] += installs
    else:
        used_google[genre] = installs
genres_google_tot = freq_table(apps_google_final,1)[0] #This is the genre frequency table generated before
for k in used_google:
    used_google[k] /= genres_google_tot[k] #dividing by the total number of apps that belong to a specific genre
print('''Google PlayStore average Installs Table:
----------------------------------------''')
display_table(used_google)
Google PlayStore average Installs Table:
----------------------------------------
COMMUNICATION : 38550548.04
VIDEO_PLAYERS : 24878048.86
SOCIAL : 23628689.23
PHOTOGRAPHY : 17805627.64
PRODUCTIVITY : 16787331.34
GAME : 15543964.82
TRAVEL_AND_LOCAL : 14120454.08
ENTERTAINMENT : 11767380.95
TOOLS : 10782301.18
NEWS_AND_MAGAZINES : 9626407.36
BOOKS_AND_REFERENCE : 8329168.94
SHOPPING : 7103190.79
PERSONALIZATION : 5240358.99
WEATHER : 5212877.1
HEALTH_AND_FITNESS : 4219697.06
MAPS_AND_NAVIGATION : 4115374.21
SPORTS : 3750580.64
FAMILY : 3714649.72
ART_AND_DESIGN : 1986335.09
FOOD_AND_DRINK : 1951283.81
EDUCATION : 1820673.08
BUSINESS : 1712290.15
LIFESTYLE : 1447458.98
HOUSE_AND_HOME : 1385541.46
FINANCE : 1365500.4
DATING : 861409.55
COMICS : 859042.16
AUTO_AND_VEHICLES : 654074.83
LIBRARIES_AND_DEMO : 649314.05
PARENTING : 552875.18
BEAUTY : 513151.89
EVENTS : 253542.22
MEDICAL : 121230.14
From the previous output we can see that the top 3 genres by average number of installs are Communication (about 38.6 million), Video Players (about 24.9 million) and Social (about 23.6 million).
Let's dive deeper into these categories.
First, let's explore the apps with the highest number of installs in the Communication genre.
print('''Apps between 100M & 1B
---------------------''')
for row in apps_google_final:
    installs = int(row[5].replace(',','').replace('+',''))
    if row[1] == 'COMMUNICATION' and 100000000 <= installs <= 1000000000: #between 100Million and 1Billion
        print(row[0],':',row[5])
Apps between 100M & 1B
---------------------
Messenger – Text and Video Chat for Free : 1,000,000,000+
WhatsApp Messenger : 1,000,000,000+
Google Chrome: Fast & Secure : 1,000,000,000+
Messenger Lite: Free Calls & Messages : 100,000,000+
Gmail : 1,000,000,000+
Hangouts : 1,000,000,000+
Viber Messenger : 500,000,000+
Firefox Browser fast & private : 100,000,000+
Yahoo Mail – Stay Organized : 100,000,000+
imo beta free calls and text : 100,000,000+
imo free video calls and chat : 500,000,000+
Opera Mini - fast web browser : 100,000,000+
Opera Browser: Fast and Secure : 100,000,000+
Who : 100,000,000+
WeChat : 100,000,000+
UC Browser Mini -Tiny Fast Private & Secure : 100,000,000+
Android Messages : 100,000,000+
Telegram : 100,000,000+
Google Duo - High Quality Video Calls : 500,000,000+
UC Browser - Fast Download Private & Secure : 500,000,000+
Skype - free IM & video calls : 1,000,000,000+
GO SMS Pro - Messenger, Free Themes, Emoji : 100,000,000+
LINE: Free Calls & Messages : 500,000,000+
BBM - Free Calls & Messages : 100,000,000+
KakaoTalk: Free Calls & Text : 100,000,000+
Truecaller: Caller ID, SMS spam blocking & Dialer : 100,000,000+
Kik : 100,000,000+
From the previous output we conclude that the 'Communication' genre is dominated by big-name apps such as WhatsApp, Facebook Messenger and Skype. Their install counts skew the average and make Communication the top genre by average installs, but this doesn't mean that every app in the genre has a high number of installs. For example, have a look at the following apps, which have fewer than 50 million installs (relatively low):
for row in apps_google_final:
    installs = int(row[5].replace(',','').replace('+',''))
    if row[1] == 'COMMUNICATION' and installs < 50000000: #Less than 50Million
        print(row[0],':',row[5])
Messenger for SMS : 10,000,000+
My Tele2 : 5,000,000+
Call Free – Free Call : 5,000,000+
Web Browser & Explorer : 5,000,000+
Browser 4G : 10,000,000+
MegaFon Dashboard : 10,000,000+
ZenUI Dialer & Contacts : 10,000,000+
Cricket Visual Voicemail : 10,000,000+
TracFone My Account : 1,000,000+
Firefox Focus: The privacy browser : 1,000,000+
Google Voice : 10,000,000+
Chrome Dev : 5,000,000+
Xperia Link™ : 10,000,000+
TouchPal Keyboard - Fun Emoji & Android Keyboard : 10,000,000+
Skype Lite - Free Video Call & Chat : 5,000,000+
WhatsApp Business : 10,000,000+
My magenta : 1,000,000+
Puffin Web Browser : 10,000,000+
Seznam.cz : 1,000,000+
Antillean Gold Telegram (original version) : 100,000+
AT&T Visual Voicemail : 10,000,000+
GMX Mail : 10,000,000+
Omlet Chat : 10,000,000+
My Vodacom SA : 5,000,000+
Microsoft Edge : 5,000,000+
Hangouts Dialer - Call Phones : 10,000,000+
Talkatone: Free Texts, Calls & Phone Number : 10,000,000+
Calls & Text by Mo+ : 5,000,000+
Messaging+ SMS, MMS Free : 1,000,000+
chomp SMS : 10,000,000+
Glide - Video Chat Messenger : 10,000,000+
Text SMS : 10,000,000+
Google Allo : 10,000,000+
Talkray - Free Calls & Texts : 10,000,000+
GroupMe : 10,000,000+
mysms SMS Text Messaging Sync : 1,000,000+
2ndLine - Second Phone Number : 1,000,000+
Ninesky Browser : 1,000,000+
Ghostery Privacy Browser : 1,000,000+
InBrowser - Incognito Browsing : 1,000,000+
Web Browser for Android : 1,000,000+
DU Browser—Browse fast & fun : 10,000,000+
Lightning Web Browser : 500,000+
Web Browser : 500,000+
Contacts+ : 10,000,000+
ExDialer - Dialer & Contacts : 10,000,000+
PHONE for Google Voice & GTalk : 1,000,000+
Safest Call Blocker : 1,000,000+
Full Screen Caller ID : 5,000,000+
Hiya - Caller ID & Block : 10,000,000+
Mr. Number-Block calls & spam : 10,000,000+
Should I Answer? : 1,000,000+
RocketDial Dialer & Contacts : 1,000,000+
CallApp: Caller ID, Blocker & Phone Call Recorder : 10,000,000+
Whoscall - Caller ID & Block : 10,000,000+
CIA - Caller ID & Call Blocker : 5,000,000+
Calls Blacklist - Call Blocker : 10,000,000+
Call Control - Call Blocker : 5,000,000+
True Contact - Real Caller ID : 1,000,000+
Video Caller Id : 1,000,000+
Sync.ME – Caller ID & Block : 5,000,000+
Burner - Free Phone Number : 1,000,000+
Caller ID + : 1,000,000+
K-9 Mail : 5,000,000+
myMail – Email for Hotmail, Gmail and Outlook Mail : 10,000,000+
Email TypeApp - Mail App : 1,000,000+
All Email Providers : 1,000,000+
Newton Mail - Email App for Gmail, Outlook, IMAP : 1,000,000+
GO Notifier : 10,000,000+
Mail1Click - Secure Mail : 10,000+
Daum Mail - Next Mail : 5,000,000+
mail.com mail : 1,000,000+
SolMail - All-in-One email app : 500,000+
Vonage Mobile® Call Video Text : 1,000,000+
JusTalk - Free Video Calls and Fun Video Chat : 5,000,000+
LokLok: Draw on a Lock Screen : 500,000+
Discord - Chat for Gamers : 10,000,000+
AntennaPict β : 1,000,000+
K-@ Mail - Email App : 100,000+
K-9 Material (unofficial) : 5,000+
M star Dialer : 100,000+
Free WiFi Connect : 10,000,000+
m:go BiH : 10,000+
N-Com Wizard : 50,000+
Opera Mini browser beta : 10,000,000+
Psiphon Pro - The Internet Freedom VPN : 10,000,000+
ICQ — Video Calls & Chat Messenger : 10,000,000+
AT&T Messages for Tablet : 1,000,000+
T-Mobile DIGITS : 100,000+
Portable Wi-Fi hotspot : 10,000,000+
AT&T Call Protect : 5,000,000+
U - Webinars, Meetings & Messenger : 500,000+
/u/app : 10,000+
[verify-U] VideoIdent : 10,000+
WhatsCall Free Global Phone Call App & Cheap Calls : 10,000,000+
X Browser : 50,000+
Free Adblocker Browser - Adblock & Popup Blocker : 10,000,000+
Adblock Browser for Android : 10,000,000+
Adblock Plus for Samsung Internet - Browse safe. : 1,000,000+
Ad Blocker Turbo - Adblocker Browser : 10,000+
Brave Browser: Fast AdBlocker : 5,000,000+
AG Contacts, Lite edition : 5,000+
Oklahoma Ag Co-op Council : 10+
Bee'ah Employee App : 100+
tournaments and more.aj.2 : 100+
Aj.Petra : 100+
AK Phone : 5,000+
PlacarTv Futebol Ao Vivo : 100,000+
WiFi Access Point (hotspot) : 100,000+
Access Point Names : 10,000+
ClanHQ : 10,000+
Ear Agent: Super Hearing : 5,000,000+
AU Call Blocker - Block Unwanted Calls Texts 2018 : 1,000+
Baby Monitor AV : 100,000+
AV Phone : 1,000+
AW - free video calls and chat : 1,000,000+
Katalogen.ax : 100+
AZ Browser. Private & Download : 100,000+
BA SALES : 1+
BD Data Plan (3G & 4G) : 500,000+
BD Internet Packages (Updated) : 50,000+
BD Dialer : 10,000+
BD Live Call : 5,000+
Best Browser BD social networking : 10+
Traffic signs BD : 500+
BF Browser by Betfilter - Stop Gambling Today! : 10,000+
My BF App : 50,000+
BH Mail : 1,000+
BJ - Confidential : 10+
BK Chat : 1,000+
Of the wall Arapaho bk : 5+
AC-BL : 50+
DMR BrandMeister Tool : 10,000+
BBMoji - Your personalized BBM Stickers : 1,000,000+
BN MALLORCA Radio : 1,000+
BQ Partners : 1,000+
BS-Mobile : 50+
ATC Unico BS : 500+
BT One Voice mobile access : 5,000+
BT Messenger : 50,000+
BT One Phone Mobile App : 10,000+
SW-100.tch by Callstel : 1,000,000+
BT MeetMe with Dolby Voice : 100,000+
Bluetooth Auto Connect : 5,000,000+
AudioBT: BT audio GPS/SMS/Text : 50,000+
BV : 100+
Feel Performer : 10,000+
Tiny Call Confirm : 1,000,000+
CB Radio Chat - for friends! : 1,000,000+
CB On Mobile : 100,000+
Virtual Walkie Talkie : 1,000,000+
Channel 19 : 100,000+
Cb browser : 50+
CF Chat: Connecting Friends : 100+
retteMi.ch : 5,000+
CJ Browser - Fast & Private : 100+
CJ DVD Rentals : 100+
CK Call NEW : 10+
CM Transfer - Share any files with friends nearby : 5,000,000+
mail.co.uk Mail : 5,000+
ClanPlay: Community and Tools for Gamers : 1,000,000+
CQ-Mobile : 1,000+
CQ-Alert : 500+
QRZ Assistant : 100,000+
Pocket Prefix Plus : 10,000+
Ham Radio Prefixes : 10,000+
CS Customizer : 1,000+
CS Browser | #1 & BEST BROWSER : 1,000+
CS Browser Beta : 5,000+
My Vodafone (GR) : 1,000,000+
IZ2UUF Morse Koch CW : 50,000+
C W Browser : 100+
CW Bluetooth SPP : 100+
CW BLE Peripheral Simulator : 500+
Morse Code Reader : 100,000+
Learn Morse Code - G0HYN Learn Morse : 5,000+
Ring : 10,000+
Hyundai CX Conference : 50+
Cy Messenger : 100+
Amadeus GR & CY : 100+
SMS Sender - sluzba.cz : 1,000+
WEB.DE Mail : 10,000,000+
Your Freedom VPN Client : 5,000,000+
Rádio Sol Nascente DF : 500+
DG Card : 100+
DK Browser : 10+
cluster.dk : 1,000+
DK TEL Dialer : 50+
DM for WhatsApp : 5,000+
DM Talk New : 5,000+
DM - The Offical Messaging App : 10+
DM Tracker : 1,000+
Call Blocker & Blacklist : 1,000+
ReadyOp DT : 1,000+
Caller ID & Call Block - DU Caller : 5,000,000+
BlueDV AMBE : 1,000+
DW Contacts & Phone & Dialer : 1,000,000+
Deaf World DW : 10,000+
Ham DX Cluster & Spots Finder : 5,000+
Mircules DX Cluster Lite : 5,000+
3G DZ Configuration : 50,000+
chat dz : 100+
love sms good morning : 5,000+
Goodbox - Mega App : 100,000+
Call Blocker - Blacklist, SMS Blocker : 1,000,000+
[EF]ShoutBox : 100+
Eg Call : 10,000+
ei : 10+
EJ messenger : 10+
Ek IRA : 10+
Orfox: Tor Browser for Android : 10,000,000+
EO Mumbai : 10+
EP RSS Reader : 100+
Voxer Walkie Talkie Messenger : 10,000,000+
ES-1 : 500+
EU Council : 1,000+
Council Voting Calculator : 5,000+
Have your say on Europe : 500+
Programi podrške EU : 100+
Inbox.eu : 10,000+
Everbridge : 100,000+
Best Auto Call Recorder Free : 500+
EZ Wifi Notification : 10,000+
Test Server SMS FA : 5+
Lite for Facebook Messenger : 1,000,000+
FC Browser - Focus Privacy Browser : 1,000+
EHiN-FH conferenceapp : 100+
Carpooling FH Hagenberg : 100+
Wi-Fi Auto-connect : 1,000,000+
Talkie - Wi-Fi Calling, Chats, File Sharing : 500,000+
WeFi - Free Fast WiFi Connect & Find Wi-Fi Map : 1,000,000+
Sat-Fi : 5,000+
Portable Wi-Fi hotspot Free : 100,000+
TownWiFi | Wi-Fi Everywhere : 500,000+
Jazz Wi-Fi : 10,000+
Sat-Fi Voice : 1,000+
Free Wi-fi HotspoT : 50,000+
FN Web Radio : 10+
FNH Payment Info : 10+
MARKET FO : 100+
FO OP St-Nazaire : 100+
FO SODEXO : 100+
FO RCBT : 100+
FO Interim : 100+
FO PSA Sept-Fons : 100+
FO AIRBUS TLSE : 1,000+
FO STELIA Méaulte : 100+
FO AIRBUS Nantes : 100+
FP Connect : 100+
FreedomPop Messaging Phone/SIM : 500,000+
FP Live : 10+
HipChat - beta version : 50,000+
We can see that most of the apps have fewer than 10M installs, which is well below the category average (~38M). The average is pulled up by a handful of giants like Facebook, WhatsApp and Skype. To build an app in this category we would have to compete with them, so we won't be building an app in this category.
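To illustrate why an average can mislead when a few giants dominate, here is a minimal sketch comparing the mean with the median. The install counts in `sample_installs` are made up for illustration and are not taken from the data set.

```python
from statistics import mean, median

# Hypothetical install counts: many small apps and one giant (not real data).
sample_installs = [5_000, 10_000, 100_000, 1_000_000, 5_000_000, 1_000_000_000]

average = mean(sample_installs)    # pulled up by the single giant
typical = median(sample_installs)  # closer to what most apps actually get

print(f"mean   = {average:,.0f}")
print(f"median = {typical:,.0f}")
```

When the mean sits far above the median, as it does here, most apps in the category are much smaller than the average suggests.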
Let's explore the rest of the categories in the same manner.
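Because the same filter is repeated for every category, it could also be wrapped in a small helper. The sketch below is one possible way to do that; `parse_installs`, `apps_in_range` and `sample_rows` are hypothetical names introduced here, and the column positions (name at index 0, category at index 1, installs at index 5) are assumed to match `apps_google_final`.

```python
def parse_installs(text):
    """Convert an installs string such as '1,000,000+' to an int."""
    return int(text.replace(',', '').replace('+', ''))


def apps_in_range(dataset, category, low, high,
                  name_idx=0, cat_idx=1, installs_idx=5):
    """Return (name, installs_text) pairs for apps in `category`
    whose install count falls within [low, high]."""
    matches = []
    for row in dataset:
        if row[cat_idx] != category:
            continue
        if low <= parse_installs(row[installs_idx]) <= high:
            matches.append((row[name_idx], row[installs_idx]))
    return matches


# Hypothetical rows shaped like apps_google_final (not real data).
sample_rows = [
    ['YouTube', 'VIDEO_PLAYERS', '', '', '', '1,000,000,000+'],
    ['Tiny Player', 'VIDEO_PLAYERS', '', '', '', '10,000+'],
    ['Facebook', 'SOCIAL', '', '', '', '1,000,000,000+'],
]

for name, installs in apps_in_range(sample_rows, 'VIDEO_PLAYERS',
                                    100_000_000, 1_000_000_000):
    print(name, ':', installs)
```

The explicit loops that follow do the same thing per category; the helper only removes the repetition.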
print('''Apps between 100M & 1B
---------------------''')
for row in apps_google_final:
    installs = int(row[5].replace(',','').replace('+',''))
    if row[1] == 'VIDEO_PLAYERS' and 100000000 <= installs <= 1000000000:  # between 100 million and 1 billion
        print(row[0],':',row[5])
Apps between 100M & 1B --------------------- YouTube : 1,000,000,000+ Motorola FM Radio : 100,000,000+ Motorola Gallery : 100,000,000+ VLC for Android : 100,000,000+ Google Play Movies & TV : 1,000,000,000+ MX Player : 500,000,000+ Dubsmash : 100,000,000+ VivaVideo - Video Editor & Photo Movie : 100,000,000+ VideoShow-Video Editor, Video Maker, Beauty Camera : 100,000,000+
print('''Apps between 100M & 1B
---------------------''')
for row in apps_google_final:
    installs = int(row[5].replace(',','').replace('+',''))
    if row[1] == 'SOCIAL' and 100000000 <= installs <= 1000000000:  # between 100 million and 1 billion
        print(row[0],':',row[5])
Apps between 100M & 1B --------------------- Facebook : 1,000,000,000+ Instagram : 1,000,000,000+ Facebook Lite : 500,000,000+ Tumblr : 100,000,000+ Snapchat : 500,000,000+ Pinterest : 100,000,000+ Google+ : 1,000,000,000+ LinkedIn : 100,000,000+ Badoo - Free Chat & Dating App : 100,000,000+ Tango - Live Video Broadcast : 100,000,000+ Tik Tok - including musical.ly : 100,000,000+ BIGO LIVE - Live Stream : 100,000,000+ VK : 100,000,000+
We conclude that the top 3 categories are skewed by a few whales from big-name companies, so the average number of installs is misleading. We won't be competing in any of these categories.
Next we will explore the Game category, which we found earlier to be the most common on both markets.
print('''Apps between 100M & 1B
---------------------''')
for row in apps_google_final:
    installs = int(row[5].replace(',','').replace('+',''))
    if row[1] == 'GAME' and 100000000 <= installs <= 1000000000:  # between 100 million and 1 billion
        print(row[0],':',row[5])
Apps between 100M & 1B --------------------- Subway Surfers : 1,000,000,000+ Candy Crush Saga : 500,000,000+ slither.io : 100,000,000+ Clash Royale : 100,000,000+ Temple Run 2 : 500,000,000+ Pou : 500,000,000+ Helix Jump : 100,000,000+ Angry Birds Rio : 100,000,000+ Plants vs. Zombies FREE : 100,000,000+ Sonic Dash : 100,000,000+ Candy Crush Soda Saga : 100,000,000+ Clash of Clans : 100,000,000+ PAC-MAN : 100,000,000+ 8 Ball Pool : 100,000,000+ Angry Birds Classic : 100,000,000+ Flow Free : 100,000,000+ Zombie Tsunami : 100,000,000+ Hill Climb Racing : 100,000,000+ My Talking Angela : 100,000,000+ Cut the Rope FULL FREE : 100,000,000+ Sniper 3D Gun Shooter: Free Shooting Games - FPS : 100,000,000+ Cooking Fever : 100,000,000+ Score! Hero : 100,000,000+ Garena Free Fire : 100,000,000+ My Talking Tom : 500,000,000+ Roll the Ball® - slide puzzle : 100,000,000+ Talking Tom Gold Run : 100,000,000+ Dream League Soccer 2018 : 100,000,000+ Traffic Racer : 100,000,000+ Hill Climb Racing 2 : 100,000,000+ Hungry Shark Evolution : 100,000,000+ Piano Tiles 2™ : 100,000,000+ Pokémon GO : 100,000,000+ Extreme Car Driving Simulator : 100,000,000+ Trivia Crack : 100,000,000+ Angry Birds 2 : 100,000,000+ Yes day : 100,000,000+ Crossy Road : 100,000,000+ Shadow Fight 2 : 100,000,000+ Agar.io : 100,000,000+ Bus Rush: Subway Edition : 100,000,000+ Jetpack Joyride : 100,000,000+ Super Mario Run : 100,000,000+ Glow Hockey : 100,000,000+ Asphalt 8: Airborne : 100,000,000+ Fruit Ninja® : 100,000,000+ Vector : 100,000,000+ Dr. Driving : 100,000,000+ Bike Race Free - Top Motorcycle Racing Games : 100,000,000+ Smash Hit : 100,000,000+ Temple Run : 100,000,000+ Geometry Dash Lite : 100,000,000+ Ant Smasher by Best Cool & Fun Games : 100,000,000+ Angry Birds Star Wars : 100,000,000+ Mobile Legends: Bang Bang : 100,000,000+ Banana Kong : 100,000,000+ Skater Boy : 100,000,000+ Modern Combat 5: eSports FPS : 100,000,000+
There are a lot of popular games available, and the Games market looks pretty saturated. Developing an app in this genre wouldn't be the smartest move, since we would be competing with a very large number of other apps.
Recalling our first analysis of the most common genres in the PlayStore, the Entertainment category had a pretty low share of the market (0.96%); however, it has a high average number of installs (11,767,380). Low supply combined with high demand makes this category a potential candidate for our next app.
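The supply-versus-demand comparison can be sketched as follows: for each category, compute its share of all apps (supply) next to its average installs per app (demand). `category_summary` and `sample_rows` are hypothetical names with made-up numbers, and the column positions are assumed to match `apps_google_final`.

```python
def category_summary(dataset, cat_idx=1, installs_idx=5):
    """Return {category: (share_percent, average_installs)}."""
    counts, totals = {}, {}
    for row in dataset:
        category = row[cat_idx]
        installs = int(row[installs_idx].replace(',', '').replace('+', ''))
        counts[category] = counts.get(category, 0) + 1
        totals[category] = totals.get(category, 0) + installs
    n_apps = len(dataset)
    return {c: (100 * counts[c] / n_apps, totals[c] / counts[c])
            for c in counts}


# Hypothetical rows (not real data): three small games, one large entertainment app.
sample_rows = [
    ['Game A', 'GAME', '', '', '', '1,000+'],
    ['Game B', 'GAME', '', '', '', '1,000+'],
    ['Game C', 'GAME', '', '', '', '4,000+'],
    ['Show App', 'ENTERTAINMENT', '', '', '', '10,000,000+'],
]

summary = category_summary(sample_rows)
for category, (share, avg_installs) in summary.items():
    print(f"{category}: {share:.2f}% of apps, {avg_installs:,.0f} average installs")
```

A category with a small share and a high average is the pattern described above, although the average should still be checked for skew before drawing conclusions.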
Let's explore it further.
print('''Apps between 100M & 1B
---------------------''')
for row in apps_google_final:
    installs = int(row[5].replace(',','').replace('+',''))
    if row[1] == 'ENTERTAINMENT' and 100000000 <= installs <= 1000000000:  # between 100 million and 1 billion
        print(row[0],':',row[5])
print('''\n\n\nApps between 1M & 50M
---------------------''')
for row in apps_google_final:
    installs = int(row[5].replace(',','').replace('+',''))
    if row[1] == 'ENTERTAINMENT' and 1000000 <= installs <= 50000000:  # between 1 million and 50 million
        print(row[0],':',row[5])
Apps between 100M & 1B --------------------- Netflix : 100,000,000+ Hotstar : 100,000,000+ Talking Angela : 100,000,000+ IMDb Movies & TV : 100,000,000+ Talking Ben the Dog : 100,000,000+ Apps between 1M & 50M --------------------- Complete Spanish Movies : 1,000,000+ Pluto TV - It’s Free TV : 1,000,000+ Tubi TV - Free Movies & TV : 10,000,000+ Mobile TV : 10,000,000+ TV+ : 5,000,000+ Digital TV : 5,000,000+ Motorola Spotlight Player™ : 10,000,000+ Vigo Lite : 5,000,000+ Peers.TV: broadcast TV channels First, Match TV, TNT ... : 5,000,000+ The green alien dance : 1,000,000+ Spectrum TV : 5,000,000+ H TV : 5,000,000+ StarTimes - Live International Champions Cup : 1,000,000+ Cinematic Cinematic : 1,000,000+ MEGOGO - Cinema and TV : 10,000,000+ DStv Now : 5,000,000+ ivi - movies and TV shows in HD : 10,000,000+ Radio Javan : 1,000,000+ Viki: Asian TV Dramas & Movies : 10,000,000+ Talking Ginger 2 : 50,000,000+ Girly Lock Screen Wallpaper with Quotes : 5,000,000+ Movies by Flixster, with Rotten Tomatoes : 10,000,000+ Low Poly – Puzzle art game : 1,000,000+ BBC Media Player : 10,000,000+ Amazon Prime Video : 50,000,000+ Adult Glitter Color by Number Book - Sandbox Pages : 1,000,000+ Twitch: Livestream Multiplayer Games & Esports : 50,000,000+ Ziggo GO : 1,000,000+ YouTube Gaming : 5,000,000+ PlayStation App : 50,000,000+ Cinemark Theatres : 1,000,000+ Regal Cinemas : 1,000,000+ Fandango Movies - Times + Tickets : 10,000,000+ AMC Theatres : 1,000,000+ VRV: Anime, game videos & more : 1,000,000+ DramaFever: Stream Asian Drama Shows & Movies : 1,000,000+ Crunchyroll - Everything Anime : 10,000,000+ Investigation Discovery GO : 1,000,000+ Crackle - Free TV & Movies : 10,000,000+ CBS - Full Episodes & Live TV : 10,000,000+ STARZ : 10,000,000+ HISTORY: Watch TV Show Full Episodes & Specials : 1,000,000+ VH1 : 1,000,000+ FOX NOW - On Demand & Live TV : 5,000,000+ BET NOW - Watch Shows : 1,000,000+ Univision NOW - Live TV and On Demand : 1,000,000+ SHOWTIME : 1,000,000+ 
Lifetime - Watch Full Episodes & Original Movies : 1,000,000+ SeriesGuide – Show & Movie Manager : 1,000,000+ Comedy Central : 1,000,000+ WWE : 10,000,000+ MTV : 1,000,000+ Showtime Anytime : 1,000,000+ FOX : 10,000,000+ Telemundo Now : 1,000,000+ Vudu Movies & TV : 10,000,000+ Yidio: TV Show & Movie Guide : 1,000,000+ Redbox : 10,000,000+ Nick Jr. - Shows & Games : 1,000,000+ ColorFul - Adult Coloring Book : 5,000,000+ Funny Pics : 1,000,000+ Funny Quotes Free : 1,000,000+ LOL Pics (Funny Pictures) : 1,000,000+ Meme Creator : 1,000,000+ Imgur: Find funny GIFs, memes & watch viral videos : 10,000,000+ SketchBook - draw and paint : 10,000,000+ Colorfy: Coloring Book for Adults - Free : 10,000,000+
Only a small number of apps dominate this category, which leaves room for competition. Also, by studying apps that have between 1M and 50M installs, we notice that:
Let's take a closer look at coloring apps only.
import re
for row in apps_google_final:
    # Match names containing 'Color' (also covers 'Coloring'), falling back to 'paint'
    color = re.findall('.*Color.*',row[0])
    if color == []:
        color = re.findall('.*paint.*',row[0])
    if row[1] == 'ENTERTAINMENT' and color != []:
        print(color,':',row[5])
['Adult Glitter Color by Number Book - Sandbox Pages'] : 1,000,000+ ['ColorFul - Adult Coloring Book'] : 5,000,000+ ['SketchBook - draw and paint'] : 10,000,000+ ['Colorfy: Coloring Book for Adults - Free'] : 10,000,000+
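The two-step keyword match above could also be written as a single case-insensitive pattern. This is only a sketch: `COLORING_PATTERN` and `is_coloring_app` are hypothetical names, and ignoring case is an assumption that widens the match compared with the original (it also catches names such as 'Paint ...' or 'COLOR ...').

```python
import re

# One pattern for both keywords; re.IGNORECASE is an assumption, not the original behavior.
COLORING_PATTERN = re.compile(r'color|paint', re.IGNORECASE)


def is_coloring_app(name):
    """Return True if the app name mentions 'color' or 'paint' in any case."""
    return COLORING_PATTERN.search(name) is not None


# Names taken from the output above, plus two made-up ones for contrast.
for name in ['ColorFul - Adult Coloring Book',
             'SketchBook - draw and paint',
             'Paint Studio (hypothetical)',
             'Movie Guide (hypothetical)']:
    print(name, '->', is_coloring_app(name))
```

Using `search` instead of `findall` with `.*` wildcards returns a simple yes/no answer, which is all the filter needs.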
Coloring apps can also be found in the Family category, so let's scan those too.
import re
for row in apps_google_final:
    installs = int(row[5].replace(',','').replace('+',''))
    # Match names containing 'Color' (also covers 'Coloring'), falling back to 'paint'
    color = re.findall('.*Color.*',row[0])
    if color == []:
        color = re.findall('.*paint.*',row[0])
    if row[1] == 'FAMILY' and color != [] and installs >= 1000000:  # at least 1 million installs
        print(color,':',row[5])
['Princess Coloring Book'] : 5,000,000+ ['No.Draw - Colors by Number 2018'] : 10,000,000+ ['Coloring & Learn'] : 5,000,000+ ['No. Color - Color by Number, Number Coloring'] : 10,000,000+ ['Draw.ly - Color by Number Pixel Art Coloring'] : 1,000,000+ ['Color by Number - Draw Sandbox Pixel Art'] : 1,000,000+ ['Sandbox - Color by Number Coloring Pages'] : 10,000,000+ ['Sandbox Art-Sandbox Color by Number Coloring Pages'] : 1,000,000+ ['Color By Number - Sandbox Pixel Coloring Book'] : 1,000,000+ ['PixelDot - Color by Number Sandbox Pixel Art'] : 1,000,000+ ['PixPanda - Color by Number Pixel Art Coloring Book'] : 1,000,000+ ['Sandbox Number Coloring Book Art - Color By Number'] : 1,000,000+ ['No.Color – Color by Number'] : 5,000,000+ ['PixBox Coloring - Color by number Sandbox'] : 1,000,000+ ['Draw Color by Number - Sandbox Pixel Art'] : 1,000,000+ ['Color by Number: Pixel Art'] : 1,000,000+ ['No.Diamond – Colors by Number'] : 1,000,000+ ['Coloring Book for Me & Mandala'] : 10,000,000+
Studying the previous results shows a trend in coloring apps that offer a 'Color by number' option for kids, so we will make sure the application we develop also has this feature. However, we have to add other features to stand out. For example, we can have one section for kids and another for adults, add tutorials, run weekly competitions between users, provide a place where users can share and discuss their paintings, show daily quotes from famous artists, etc.
Now let's study the Apple AppStore based on the average number of ratings, since our data doesn't have an installs field like the Google PlayStore data. We also want to further explore the potential of coloring apps in the AppStore, because we want our app to be successful on both platforms.
used_ios={}
for row in apps_ios_final:
    genre = row[11]
    rating_count_tot = int(row[5])
    if genre in used_ios:
        used_ios[genre] += rating_count_tot
    else:
        used_ios[genre] = rating_count_tot
# Divide each genre's total ratings by its number of apps to get the average
for k in used_ios:
    used_ios[k] /= genres_ios_tot[k]
print('''Apple AppStore average rating_count_tot Table:
----------------------------------------------''')
display_table(used_ios)
Apple AppStore average rating_count_tot Table: ---------------------------------------------- Navigation : 86090.33 Reference : 79350.47 Social Networking : 72916.55 Music : 58205.03 Weather : 54215.3 Book : 46384.92 Food & Drink : 33333.92 Finance : 32367.03 Travel : 31358.5 Photo & Video : 28441.54 Shopping : 27816.2 Health & Fitness : 24037.63 Sports : 23008.9 Games : 22985.21 Productivity : 21799.15 News : 21750.07 Utilities : 19900.47 Lifestyle : 16739.35 Entertainment : 14364.77 Business : 7491.12 Education : 7003.98 Catalogs : 4004.0 Medical : 612.0
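For readers who want to reproduce a table like this without the helpers defined earlier in the notebook, here is a self-contained sketch. `average_by_genre`, `make_row` and the sample numbers are hypothetical; the column positions (ratings at index 5, genre at index 11) follow the AppleStore.csv header shown at the start.

```python
def average_by_genre(dataset, genre_idx=11, ratings_idx=5):
    """Return (genre, average rating count) pairs, highest average first."""
    totals, counts = {}, {}
    for row in dataset:
        genre = row[genre_idx]
        totals[genre] = totals.get(genre, 0) + int(row[ratings_idx])
        counts[genre] = counts.get(genre, 0) + 1
    averages = {genre: totals[genre] / counts[genre] for genre in totals}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


def make_row(genre, ratings):
    """Build a hypothetical 12-column row with only the two fields we need."""
    row = [''] * 12
    row[5] = str(ratings)
    row[11] = genre
    return row


# Made-up rows for illustration (not real data).
sample_rows = [make_row('Games', 100), make_row('Games', 300), make_row('Music', 500)]

for genre, avg in average_by_genre(sample_rows):
    print(genre, ':', round(avg, 2))
```

Counting apps inside the same loop avoids depending on a separate per-genre frequency table.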
for row in apps_ios_final:
    # Match names containing 'Color' (also covers 'Coloring'), falling back to 'paint'
    color = re.findall('.*Color.*',row[1])
    if color == []:
        color = re.findall('.*paint.*',row[1])
    if row[11] == 'Entertainment' and color != []:
        print(color,':',row[5])
['Colorfy: Coloring Book for Adults'] : 247809 ['Recolor - Coloring Book'] : 31180 ['Pigment - Coloring Book for Adults'] : 23967 ['ColorArt: Coloring Book For Adults'] : 15797 ['Hair Color Changer - Styles Salon & Recolor Booth'] : 11828 ['Coloring Book for Me - Coloring pages for adults'] : 7692 ['Pixel Color Ball Fell From The Sky'] : 6493 ['ibis Paint X - speed painting'] : 1933 ['Crayola Color Alive'] : 816 ['Disney Color and Play'] : 658 ['Quiver - 3D Coloring App'] : 325
The output implies that most of the coloring apps on the AppStore are made for adults, which means that our app will stand out with its kids section and its Color by number feature for kids. The idea of building a coloring app seems to fit well with our goal as they:
In this project, we analyzed data about the App Store and Google Play mobile apps with the goal of recommending an app profile that can be profitable for both markets.
We concluded that a coloring app has good potential to succeed in both markets. Our app will have a section for kids with a color by number feature, which was trending on the Google PlayStore, and a section for adults, which was trending on the Apple AppStore. Other features might include tutorials, weekly competitions between users, a place where users can share and discuss their paintings, daily quotes from famous artists, etc.