What free app genres are most prevalent on the App Store and Google Play Store? Which have the highest levels of user engagement? Answering these questions is critical for modelling user behavior in the mobile app market.
While simply classifying apps by genre popularity neglects other decision-relevant information (such as the effort required to build an app, revenue potential, etc.), detailed knowledge of trends in app stores enables developers to make data-driven decisions about what kind of apps they build.
There are ~4 million apps between the App store and Google Play store, and comprehensive data is unavailable. To approximate both markets, we'll use publicly available samples, uploaded to Kaggle, and obtained through web scraping.
The Play Store dataset is here, and contains roughly 10,000 apps. Key context and caveats:
The App Store dataset contains roughly 7,000 apps. The same caveats from the Play Store apply to the App Store; note that the App Store data originates from a more streamlined iTunes API.
The two functions below, open_dataset() and explore_data(), streamline exploring the datasets.
from csv import reader

def open_dataset(filename, header=True):
    """Opens a given dataset.
    Args:
        filename (str): A .csv file.
        header (bool): Indicates whether the dataset includes a header row.
    Returns:
        If the dataset contains a header returns the header row and data separately.
        If there is no header, only returns the data.
    """
    # the context manager closes the file once all rows have been read
    with open(filename, encoding='utf-8', newline='') as opened_file:
        data = list(reader(opened_file))  # reader is an iterator over rows
    if header:
        return data[0], data[1:]
    return data

# open both datasets
play_store_header, play_store = open_dataset('googleplaystore.csv')
app_store_header, app_store = open_dataset('AppleStore.csv')
def explore_data(dataset, start, end, rows_and_columns=False):
    """Prints rows from a dataset in a readable format.
    Args:
        dataset (list): The data set to be explored, a list of lists
        start (int): Starting index of a slice of the data set
        end (int): Ending index of slice
        rows_and_columns (bool): Indicates whether to print number of rows and columns or not
    Returns:
        None
    """
    dataset_slice = dataset[start:end]
    for row in dataset_slice:
        print(row)
        print('\n')  # add an empty row after each row
    if rows_and_columns:
        print('Number of rows:', len(dataset))
        print('Number of columns:', len(dataset[0]))

# print headers, and the first couple rows for each dataset.
print("Play Store columns:\n")
print(play_store_header)
print('\n')
explore_data(play_store, 0, 2, True)
print("\nApp Store columns:\n")
print(app_store_header)
print('\n')
explore_data(app_store, 0, 2, True)
Play Store columns:
['App', 'Category', 'Rating', 'Reviews', 'Size', 'Installs', 'Type', 'Price', 'Content Rating', 'Genres', 'Last Updated', 'Current Ver', 'Android Ver']
['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']
['Coloring book moana', 'ART_AND_DESIGN', '3.9', '967', '14M', '500,000+', 'Free', '0', 'Everyone', 'Art & Design;Pretend Play', 'January 15, 2018', '2.0.0', '4.0.3 and up']
Number of rows: 10841
Number of columns: 13

App Store columns:
['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']
['284882215', 'Facebook', '389879808', 'USD', '0.0', '2974676', '212', '3.5', '3.5', '95.0', '4+', 'Social Networking', '37', '1', '29', '1']
['389801252', 'Instagram', '113954816', 'USD', '0.0', '2161558', '1289', '4.5', '4.0', '10.23', '12+', 'Photo & Video', '37', '0', '29', '1']
Number of rows: 7197
Number of columns: 16
Of note, ad and microtransaction data is not included in either dataset. The columns for the Play Store are self-explanatory; those for the App Store are less clear, and detailed descriptions of each column can be found at the App Store dataset link above.
There is a missing category for one of the apps listed, on row 10472. Discussion about the missing value can be found on this Kaggle thread. We'll deal with this by deleting the app.
explore_data(play_store, 10472, 10473)  # the category (column index 1) is missing, so later values are shifted left
['Life Made WI-Fi Touchscreen Photo Frame', '1.9', '19', '3.0M', '1,000+', 'Free', '0', 'Everyone', '', 'February 11, 2018', '1.0.19', '4.0 and up']
del play_store[10472] # run only once
explore_data(play_store, 10472, 10473)  # the next app now occupies this index
['osmino Wi-Fi: free WiFi', 'TOOLS', '4.2', '134203', '4.1M', '10,000,000+', 'Free', '0', 'Everyone', 'Tools', 'August 7, 2018', '6.06.14', '4.4 and up']
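A row like the one deleted above has fewer values than the header, so it could also be found programmatically instead of by index. The following is a minimal sketch of that idea; the helper name `find_malformed_rows` and the tiny sample data are hypothetical and are not part of the original notebook.

```python
def find_malformed_rows(header, rows):
    """Return (index, row) pairs whose column count differs from the header's."""
    expected = len(header)
    return [(i, row) for i, row in enumerate(rows) if len(row) != expected]


# Hypothetical sample: the second row is missing its category value.
sample_header = ['App', 'Category', 'Rating']
sample_rows = [
    ['App A', 'TOOLS', '4.2'],
    ['App B', '1.9'],
    ['App C', 'GAME', '3.9'],
]

for index, row in find_malformed_rows(sample_header, sample_rows):
    print(index, row)
```

With the real data, a call such as `find_malformed_rows(play_store_header, play_store)` would be expected to flag any row whose length does not match the header, which is one way to confirm that no other shifted rows remain.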
There are duplicate apps in the Play Store dataset.
While the App Store uses an indexed appendix-like page suitable for web scraping, the Play Store relies on modern techniques like dynamic page load that make scraping difficult, which may explain the duplicate entries. In this section, I'll consolidate each group of duplicates into the app version that has the greatest number of reviews.
# create a list of duplicate apps
duplicates = []
unique_apps = []
for app in play_store:
    name = app[0]
    if name in unique_apps:
        duplicates.append(name)
    else:
        unique_apps.append(name)
print('There are', len(duplicates), 'duplicates.')
There are 1181 duplicates.
# the app version with the most reviews is probably the most recent version
reviews_max = {}
for app in play_store:
    name = app[0]
    n_reviews = float(app[3])
    # update dictionary if there is no key yet, or if we have a new max
    if name not in reviews_max or reviews_max[name] < n_reviews:
        reviews_max[name] = n_reviews
print('Expected length:', len(play_store) - 1181)
print('Actual length:', len(reviews_max))
Expected length: 9659 Actual length: 9659
# clean the dataset and create play_store_clean
already_added = []
play_store_clean = []
for app in play_store:
    name = app[0]
    n_reviews = float(app[3])
    # keep only the first row that matches the maximum review count for this name
    if n_reviews == reviews_max[name] and name not in already_added:
        play_store_clean.append(app)
        already_added.append(name)
explore_data(play_store_clean, 0, 2, True)
['Photo Editor & Candy Camera & Grid & ScrapBook', 'ART_AND_DESIGN', '4.1', '159', '19M', '10,000+', 'Free', '0', 'Everyone', 'Art & Design', 'January 7, 2018', '1.0.0', '4.0.3 and up']
['U Launcher Lite – FREE Live Cool Themes, Hide Apps', 'ART_AND_DESIGN', '4.7', '87510', '8.7M', '5,000,000+', 'Free', '0', 'Everyone', 'Art & Design', 'August 1, 2018', '1.2.4', '4.0.3 and up']
Number of rows: 9659
Number of columns: 13
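The two-pass deduplication above could also be sketched as a single pass that stores the best row per app name in a dictionary. This is only an illustrative alternative: the function name `dedupe_by_max_reviews` and the sample rows are hypothetical, and the column positions (name at index 0, reviews at index 3) are taken from the Play Store header shown earlier.

```python
def dedupe_by_max_reviews(rows, name_index=0, reviews_index=3):
    """Keep one row per app name: the first row seen with the highest review count."""
    best = {}  # name -> row; dicts preserve insertion order in Python 3.7+
    for row in rows:
        name = row[name_index]
        n_reviews = float(row[reviews_index])
        # a strict '>' means ties keep the row that was seen first
        if name not in best or n_reviews > float(best[name][reviews_index]):
            best[name] = row
    return list(best.values())


# Hypothetical sample rows in (name, category, rating, reviews) order.
sample_rows = [
    ['App A', 'TOOLS', '4.0', '10'],
    ['App B', 'GAME', '4.1', '5'],
    ['App A', 'TOOLS', '4.0', '30'],
    ['App A', 'TOOLS', '4.0', '30'],
]

for row in dedupe_by_max_reviews(sample_rows):
    print(row)
```

One design difference worth noting: the result is ordered by the first appearance of each name, so the row order may differ from play_store_clean even though the number of kept rows should match. Dictionary lookups also avoid the repeated list membership checks used in the loops above.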
Some of the apps in both stores are non-English. To simplify our analysis, we'll only consider apps directed towards English-speaking audiences by filtering out app names with non-English characters.
def check_ASCII(name):
    '''Checks if there's any character in a string that is outside of the ASCII set.
    Args:
        name (str): The name of the app to be checked.
    Returns:
        A boolean indicating whether the string contains only ASCII characters.
    '''
    for char in name:
        if ord(char) > 127:
            return False
    return True

print(check_ASCII("Instagram"))
print(check_ASCII('爱奇艺PPS -《欢乐颂2》电视剧热播'))
True False
The check_ASCII function is working well. However, some English apps can have characters from outside the ASCII set:
print(check_ASCII('Instachat 😜'))
print(check_ASCII('Docs To Go™ Free Office Suite'))
False False
Let's relax the criterion in check_ASCII so that the function returns False only if four or more characters fall outside the ASCII set.
def check_ASCII(name):
    '''Checks whether a string has at most three characters outside of ASCII.
    Args:
        name (str): The name of the app to be checked.
    Returns:
        A boolean: False if more than three characters are outside the ASCII set, True otherwise.
    '''
    counter = 0
    for char in name:
        if ord(char) > 127:
            counter += 1
    return counter <= 3

print(check_ASCII('Instachat 😜'))
print(check_ASCII('Docs To Go™ Free Office Suite'))
True True
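The threshold of three non-ASCII characters is a judgment call, so it can help to make it a parameter and see how the result changes. Below is a minimal sketch of that variant; the name `is_probably_english` and the `max_non_ascii` parameter are hypothetical and not part of the original notebook.

```python
def is_probably_english(name, max_non_ascii=3):
    """Return True if at most max_non_ascii characters fall outside the ASCII range."""
    non_ascii = sum(1 for char in name if ord(char) > 127)
    return non_ascii <= max_non_ascii


print(is_probably_english('Docs To Go™ Free Office Suite'))
print(is_probably_english('爱奇艺PPS -《欢乐颂2》电视剧热播'))
print(is_probably_english('Instachat 😜', max_non_ascii=0))
```

Setting `max_non_ascii=0` reproduces the strict first version of check_ASCII, while the default of 3 matches the relaxed one. This remains a rough heuristic: a short non-English name with three or fewer non-ASCII characters would still pass.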
Let's now filter the dataset, so that we only have English apps, and store the data in lists play_store_English and app_store_English.
play_store_English = []
app_store_English = []
for app in play_store_clean:
    name = app[0]
    if check_ASCII(name):
        play_store_English.append(app)
for app in app_store:
    name = app[1]
    if check_ASCII(name):
        app_store_English.append(app)
Let's isolate the free apps by adding them to new lists whenever their price is 0.
play_store_free = []
app_store_free = []
for app in play_store_English:
    if app[7] == '0':
        play_store_free.append(app)
for app in app_store_English:
    if app[4] == '0.0':
        app_store_free.append(app)
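The comparisons above match the exact strings '0' and '0.0', which works because each store formats free prices consistently. A slightly more tolerant check could parse the price as a number instead. The sketch below is hypothetical: the helper `is_free` is not in the original notebook, and it assumes paid prices might carry a leading '$', which the sample rows shown earlier do not confirm.

```python
def is_free(price):
    """Return True when a price string represents zero cost."""
    # ASSUMPTION: paid prices may carry a leading '$'; it is stripped before parsing
    cleaned = price.strip().lstrip('$')
    try:
        return float(cleaned) == 0.0
    except ValueError:
        return False  # unparseable prices are treated as not free


for price in ['0', '0.0', '$4.99', 'Everyone']:
    print(price, is_free(price))
```

With this helper, both loops could share one condition, for example `if is_free(app[7])` for the Play Store and `if is_free(app[4])` for the App Store.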
Which app genres are popular on both markets? We'll create a frequency table to assess the percentage of apps among different genres. We'll analyze the genre breakdown for both the Google Play Store and App Store, identify similarities and differences, and derive conclusions.
def freq_table(dataset, index):
    '''Creates a frequency table made of percentages given a dataset and category.
    Args:
        dataset: A list of lists
        index (int): The category from which to generate a frequency table
    Returns:
        A dictionary that associates values with their relative prevalence in a dataset
    '''
    ft = {}
    for row in dataset:
        if row[index] in ft:
            ft[row[index]] += 1
        else:
            ft[row[index]] = 1
    # make percentages
    for key in ft:
        ft[key] = ft[key] * 100 / len(dataset)
    return ft

def display_table(dataset, index):
    '''Displays a sorted frequency table in a readable format.
    Args:
        dataset: A list of lists
        index (int): The category from which to display a frequency table
    Returns:
        None
    '''
    table = freq_table(dataset, index)
    table_display = []
    for key in table:
        key_val_as_tuple = (table[key], key)
        table_display.append(key_val_as_tuple)
    table_sorted = sorted(table_display, reverse=True)
    for entry in table_sorted:
        if entry[0] > 1:  # anything greater than 1% of apps
            print(entry[1], ':', round(entry[0], 2))  # displays to two decimal places.

display_table(play_store_free, 1)  # Categories
print('\n')
display_table(play_store_free, -4)  # Genre
FAMILY : 18.91
GAME : 9.72
TOOLS : 8.46
BUSINESS : 4.59
LIFESTYLE : 3.9
PRODUCTIVITY : 3.89
FINANCE : 3.7
MEDICAL : 3.53
SPORTS : 3.4
PERSONALIZATION : 3.32
COMMUNICATION : 3.24
HEALTH_AND_FITNESS : 3.08
PHOTOGRAPHY : 2.94
NEWS_AND_MAGAZINES : 2.8
SOCIAL : 2.66
TRAVEL_AND_LOCAL : 2.34
SHOPPING : 2.25
BOOKS_AND_REFERENCE : 2.14
DATING : 1.86
VIDEO_PLAYERS : 1.79
MAPS_AND_NAVIGATION : 1.4
FOOD_AND_DRINK : 1.24
EDUCATION : 1.16

Tools : 8.45
Entertainment : 6.07
Education : 5.35
Business : 4.59
Productivity : 3.89
Lifestyle : 3.89
Finance : 3.7
Medical : 3.53
Sports : 3.46
Personalization : 3.32
Communication : 3.24
Action : 3.1
Health & Fitness : 3.08
Photography : 2.94
News & Magazines : 2.8
Social : 2.66
Travel & Local : 2.32
Shopping : 2.25
Books & Reference : 2.14
Simulation : 2.04
Dating : 1.86
Arcade : 1.85
Video Players & Editors : 1.77
Casual : 1.76
Maps & Navigation : 1.4
Food & Drink : 1.24
Puzzle : 1.13
Categories are more inclusive than genres, which are relatively granular. While family and game apps are common, we see that there is a significant portion of practical apps, in tools, business, productivity, finance, etc. Let's visualize the categories:
import matplotlib.pyplot as plt
import seaborn as sns

category_ft = freq_table(play_store_free, 1)
fig, ax = plt.subplots(figsize=(12, 2))
# pass x and y as keywords; recent seaborn versions no longer accept them positionally
sns.barplot(x=list(category_ft.keys()), y=list(category_ft.values()), ax=ax)
plt.xticks(rotation=90)
plt.show()
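As an aside, the percentage frequency table used for these plots could also be sketched with `collections.Counter` from the standard library, which handles the counting and the descending sort. The function name `freq_table_pct` and the sample rows below are hypothetical and only illustrate the idea.

```python
from collections import Counter


def freq_table_pct(dataset, index):
    """Map each value in a column to its share of rows, as a percentage."""
    counts = Counter(row[index] for row in dataset)
    total = len(dataset)
    # most_common() yields (value, count) pairs from most to least frequent
    return {key: count * 100 / total for key, count in counts.most_common()}


# Hypothetical sample rows in (name, category) order.
sample = [['a', 'GAME'], ['b', 'GAME'], ['c', 'TOOLS'], ['d', 'FAMILY']]
table = freq_table_pct(sample, 1)
for category, pct in table.items():
    print(category, ':', round(pct, 2))
```

Because the returned dictionary is already ordered from most to least common, its keys and values could be passed straight to a bar plot to get sorted bars.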
We'll do a similar analysis for the App Store:
display_table(app_store_free, -5)
Games : 58.16
Entertainment : 7.88
Photo & Video : 4.97
Education : 3.66
Social Networking : 3.29
Shopping : 2.61
Utilities : 2.51
Sports : 2.14
Music : 2.05
Health & Fitness : 2.02
Productivity : 1.74
Lifestyle : 1.58
News : 1.33
Travel : 1.24
Finance : 1.12
category_ft = freq_table(app_store_free, -5)
# pass x and y as keywords; recent seaborn versions no longer accept them positionally
sns.barplot(x=list(category_ft.keys()), y=list(category_ft.values()))
plt.xticks(rotation=90)
plt.show()
We see that games dominate the App Store, in terms of sheer number of apps. The saturation in one genre contrasts with the Google Play Store. Furthermore, the prevalent genres are more geared towards fun; for example, entertainment, photos & videos, and social networking.
It's important to keep in mind that this is just a slice of the data, drawn from a sample, and only includes free, English-language apps. The number of apps in a genre also does not reveal the number of users or their engagement.
A better proxy for user engagement is number of ratings:
app_store_ft = freq_table(app_store_free, -5)

# calculate number of ratings
def genre_avg_n_ratings(dataset, genre_index, n_ratings_index):
    '''Prints the average number of ratings for each genre, and generates a bar plot.
    Args:
        dataset: A list of lists
        genre_index (int): index for the genre column (1 for Play Store, -5 for App Store)
        n_ratings_index (int): index for the n_ratings column (3 for Play Store, 5 for App Store)
    Returns:
        None
    '''
    store_ft = freq_table(dataset, genre_index)
    x_genres = []
    y_n_ratings = []
    for genre in store_ft:
        x_genres.append(genre)
        total = 0
        len_genre = 0
        for app in dataset:
            genre_app = app[genre_index]
            if genre_app == genre:
                n_ratings = float(app[n_ratings_index])
                total += n_ratings
                len_genre += 1
        avg_n_ratings = round(total / len_genre, 1)
        y_n_ratings.append(avg_n_ratings)
        print(genre + ": " + str(avg_n_ratings))
    # keyword arguments keep the call compatible with recent seaborn versions
    sns.barplot(x=x_genres, y=y_n_ratings)
    plt.xticks(rotation=90)
    plt.show()

genre_avg_n_ratings(app_store_free, -5, 5)
Social Networking: 71548.3
Photo & Video: 28441.5
Games: 22788.7
Music: 57326.5
Reference: 74942.1
Health & Fitness: 23298.0
Weather: 52279.9
Utilities: 18684.5
Travel: 28243.8
Shopping: 26919.7
News: 21248.0
Navigation: 86090.3
Lifestyle: 16485.8
Entertainment: 14029.8
Food & Drink: 33333.9
Sports: 23008.9
Book: 39758.5
Finance: 31467.9
Education: 7004.0
Productivity: 21028.4
Business: 7491.1
Catalogs: 4004.0
Medical: 612.0
Navigation, reference, and social networking apps have the highest average number of ratings, with music not far behind. Games also show solid engagement: considering the vast number of games on the platform, ~23k ratings per app is very respectable.
Based on average number of ratings, some genres show promise in terms of user engagement: social networking, reference, and navigation. However, these are app genres that are saturated by huge companies. To demonstrate, let's print the top 10 apps for these genres:
def print_top_apps(genre, dataset=app_store_free, n_ratings_index=5, name_index=1, genre_index=-5, num=10):
    '''Prints num of the top apps in a genre by number of ratings. Default args are for the App Store.
    Assumes the dataset is already sorted by number of ratings, in descending order.
    Args:
        genre (str): Category of app
        dataset (list): The list of rows corresponding to apps
        n_ratings_index (int): The column index for number of ratings (3 for the Play Store)
        name_index (int): The column index for the name of the app (0 for the Play Store)
        genre_index (int): Column index for genre (1 for the Play Store)
        num (int): How many apps to print
    Returns:
        None
    '''
    counter = 0
    print(genre)
    for app in dataset:
        if app[genre_index] == genre:
            print(app[name_index] + ': ' + app[n_ratings_index])
            counter += 1
            if counter == num:  # only want to print num apps
                break

print_top_apps('Social Networking')
print('\n')
print_top_apps('Reference')
print('\n')
print_top_apps('Navigation')
Social Networking
Facebook: 2974676
Pinterest: 1061624
Skype for iPhone: 373519
Messenger: 351466
Tumblr: 334293
WhatsApp Messenger: 287589
Kik: 260965
ooVoo – Free Video Call, Text and Voice: 177501
TextNow - Unlimited Text + Calls: 164963
Viber Messenger – Text & Call: 164249

Reference
Bible: 985920
Dictionary.com Dictionary & Thesaurus: 200047
Dictionary.com Dictionary & Thesaurus for iPad: 54175
Google Translate: 26786
Muslim Pro: Ramadan 2017 Prayer Times, Azan, Quran: 18418
New Furniture Mods - Pocket Wiki & Game Tools for Minecraft PC Edition: 17588
Merriam-Webster Dictionary: 16849
Night Sky: 12122
City Maps for Minecraft PE - The Best Maps for Minecraft Pocket Edition (MCPE): 8535
LUCKY BLOCK MOD ™ for Minecraft PC Edition - The Best Pocket Wiki & Mods Installer Tools: 4693

Navigation
Waze - GPS Navigation, Maps & Real-time Traffic: 345046
Google Maps - Navigation & Transit: 154911
Geocaching®: 12811
CoPilot GPS – Car Navigation & Offline Maps: 3582
ImmobilienScout24: Real Estate Search in Germany: 187
Railway Route Search: 5
Apps like Facebook, the Bible, and Waze heavily skew the average number of ratings upward for these app genres. For games, this is much less pronounced:
print_top_apps('Games')
Games
Clash of Clans: 2130805
Temple Run: 1724546
Candy Crush Saga: 961794
Angry Birds: 824451
Subway Surfers: 706110
Solitaire: 679055
CSR Racing: 677247
Crossy Road - Endless Arcade Hopper: 669079
Injustice: Gods Among Us: 612532
Hay Day: 567344
The 10th game on the list, Hay Day, has more than three times as many ratings as the 10th social networking app, Viber Messenger, despite the larger average number of ratings for social networking apps.
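Since a handful of giant apps can pull a genre's mean upward, comparing the mean with the median is one quick way to see how skewed a genre is. The sketch below uses `statistics.mean` and `statistics.median` from the standard library; the function name `ratings_summary` and the sample rows are hypothetical, and the sample numbers are invented purely for illustration.

```python
from statistics import mean, median


def ratings_summary(dataset, genre, genre_index, n_ratings_index):
    """Return (mean, median) of rating counts for one genre, or None if it has no apps."""
    counts = [
        float(app[n_ratings_index])
        for app in dataset
        if app[genre_index] == genre
    ]
    if not counts:
        return None
    return mean(counts), median(counts)


# Hypothetical sample rows in (name, genre, number of ratings) order.
sample = [
    ['Giant Maps', 'Navigation', '1000000'],
    ['Small Maps', 'Navigation', '100'],
    ['Tiny Maps', 'Navigation', '50'],
    ['Some Game', 'Games', '500'],
]

print(ratings_summary(sample, 'Navigation', 1, 2))
```

A mean far above the median suggests that most apps in the genre get far fewer ratings than the average implies. For the App Store lists above, the call would use `genre_index=-5` and `n_ratings_index=5`.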
Let's be a little more systematic and design a function that calculates the ratio between the number of ratings of the top app and the number of ratings of the 20th app, to get a rough idea of which genres are most tractable for small developers.
def rating_inequality(genre, dataset=app_store_free, n_ratings_index=5, name_index=1, genre_index=-5, num=20):
    '''Prints and returns the ratio between the number of ratings of the top app and of the
    app ranked num in a genre. Default args are for the App Store.
    Assumes the dataset is already sorted by number of ratings, in descending order.
    Genres without a 20th ranking app are not included.
    Args:
        genre (str): Category of app
        dataset (list): The list of rows corresponding to apps
        n_ratings_index (int): The column index for number of ratings (3 for the Play Store)
        name_index (int): The column index for the name of the app (0 for the Play Store)
        genre_index (int): Column index for genre (1 for the Play Store)
        num (int): The app rank to compare to the top app
    Returns:
        The ratio as a float, or None if the genre has fewer than num apps
    '''
    counter = 0
    for app in dataset:
        if app[genre_index] == genre:
            if counter == 0:
                top_rating = float(app[n_ratings_index])
            counter += 1
            if counter == num:
                num_rating = float(app[n_ratings_index])
                print(genre, '-', top_rating / num_rating)
                return top_rating / num_rating

for genre in app_store_ft:
    rating_inequality(genre)
Social Networking - 60.08232680266613
Photo & Video - 63.72329825182041
Games - 5.261155980020098
Music - 78.97946453602466
Health & Fitness - 74.57491186839013
Weather - 13395.297297297297
Utilities - 51.725105189340816
Travel - 489.23793859649123
Shopping - 17.232263652862564
News - 511.6445086705202
Lifestyle - 154.00493938033227
Entertainment - 5.581047381546135
Food & Drink - 10477.793103448275
Sports - 17.7599023497101
Finance - 154.8937583001328
Education - 22.60678060302904
Productivity - 12.586153004610455
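The ratio above depends on the rows already being ordered by number of ratings. A variant that sorts each genre itself, and that guards against genres with too few apps or a zero denominator, might look like the sketch below. The name `rank_ratio` and the sample rows are hypothetical and are not part of the original notebook.

```python
def rank_ratio(dataset, genre, n_ratings_index, genre_index, rank=20):
    """Ratio of the top app's rating count to that of the app at the given rank.

    Returns None when the genre has fewer than `rank` apps or when the
    app at that rank has zero ratings.
    """
    counts = sorted(
        (float(app[n_ratings_index]) for app in dataset if app[genre_index] == genre),
        reverse=True,
    )
    if len(counts) < rank or counts[rank - 1] == 0:
        return None
    return counts[0] / counts[rank - 1]


# Hypothetical, deliberately unsorted rows in (name, genre, number of ratings) order.
sample = [
    ['Game C', 'Games', '10'],
    ['Game A', 'Games', '100'],
    ['Game B', 'Games', '50'],
]

print(rank_ratio(sample, 'Games', n_ratings_index=2, genre_index=1, rank=3))
print(rank_ratio(sample, 'Games', n_ratings_index=2, genre_index=1, rank=5))
```

Returning None for small genres mirrors how the function above silently skips genres without a 20th app, while sorting inside the function removes the dependence on the input order.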
By this metric, games, entertainment, productivity, shopping, and education apps offer promising opportunities for smaller developers. Given the dominance of games and entertainment apps on the App Store, iOS developers should remain focused on these genres. Beyond these two genres, education and shopping apps are also particularly attractive due to their small but significant share of the App Store.
While the gaming app market on the App Store is relatively saturated, the solid average number of user ratings makes it the primary genre to recommend for aspiring iOS developers interested in growth.
Let's repeat a similar analysis for the Play Store and make recommendations. The hard work of writing functions is already done, so the process is straightforward:
# the Play Store is not ordered by reviews
play_store_free = sorted(play_store_free, key = lambda x: float(x[3]), reverse=True)
# calculate number of ratings
play_store_ft = freq_table(play_store_free, 1)
genre_avg_n_ratings(play_store_free, 1, 3)
SOCIAL: 965831.0
COMMUNICATION: 995608.5
GAME: 683523.8
TOOLS: 305732.9
VIDEO_PLAYERS: 425350.1
NEWS_AND_MAGAZINES: 93088.0
PHOTOGRAPHY: 404081.4
FAMILY: 113143.0
TRAVEL_AND_LOCAL: 129484.4
PERSONALIZATION: 181122.3
MAPS_AND_NAVIGATION: 142860.0
SHOPPING: 223887.3
ENTERTAINMENT: 301752.2
PRODUCTIVITY: 160634.5
HEALTH_AND_FITNESS: 78095.0
SPORTS: 116938.6
BOOKS_AND_REFERENCE: 87995.1
LIFESTYLE: 33921.8
WEATHER: 171250.8
FINANCE: 38535.9
BUSINESS: 24239.7
EDUCATION: 56293.1
FOOD_AND_DRINK: 57478.8
COMICS: 42585.6
PARENTING: 16378.7
DATING: 21953.3
HOUSE_AND_HOME: 26435.5
LIBRARIES_AND_DEMO: 10925.8
ART_AND_DESIGN: 24699.4
AUTO_AND_VEHICLES: 14140.3
MEDICAL: 3730.2
BEAUTY: 7476.2
EVENTS: 2555.8
The Play Store data is much more mixed in terms of average number of reviews. Communication, social, and gaming apps have the highest averages.
Let's get a little more context on the raw number of reviews with our rating inequality function:
for genre in play_store_ft:
    # Play Store indices: reviews at 3, app name at 0, category at 1
    rating_inequality(genre, play_store_free, 3, 0, 1)
SOCIAL - 86.83638719024425
COMMUNICATION - 27.142346760262615
GAME - 7.2420067767956455
TOOLS - 23.084384774476028
VIDEO_PLAYERS - 59.57441546710384
NEWS_AND_MAGAZINES - 62.767118202750105
PHOTOGRAPHY - 7.483839421088904
FAMILY - 5.558485710710575
TRAVEL_AND_LOCAL - 42.679691110412776
PERSONALIZATION - 15.392347329070624
MAPS_AND_NAVIGATION - 128.07687131448
SHOPPING - 16.913286503852543
ENTERTAINMENT - 27.59062364112573
PRODUCTIVITY - 6.273359122845857
HEALTH_AND_FITNESS - 16.8683248610772
SPORTS - 8.501647466164416
BOOKS_AND_REFERENCE - 13.846642347554313
LIFESTYLE - 29.08894218236797
WEATHER - 64.26945799457995
FINANCE - 9.933793930809202
BUSINESS - 15.02377179080824
EDUCATION - 14.750612418787943
FOOD_AND_DRINK - 15.130145012450564
COMICS - 199.43823760818253
PARENTING - 339.2201030927835
DATING - 12.91033742101451
HOUSE_AND_HOME - 37.313125
LIBRARIES_AND_DEMO - 149.5195857721747
ART_AND_DESIGN - 194.49077733860344
AUTO_AND_VEHICLES - 53.349028840494405
MEDICAL - 14.604108309990663
BEAUTY - 117.9616182572614
EVENTS - 70.00523560209425
Based on rating inequality, games, photography, family, productivity, sports, and finance apps are promising opportunities for developers.
Games and photography apps are especially notable because of their high volume of ratings, and low ratio between the top app and 20th app in number of user ratings.
# Let's see some examples:
print_top_apps('GAME',play_store_free,3,0,1, 15)
print_top_apps('PHOTOGRAPHY',play_store_free,3,0,1,15)
GAME
Clash of Clans: 44893888
Subway Surfers: 27725352
Clash Royale: 23136735
Candy Crush Saga: 22430188
My Talking Tom: 14892469
8 Ball Pool: 14201891
Shadow Fight 2: 10981850
Pou: 10486018
Pokémon GO: 10424925
Yes day: 10055521
Dream League Soccer 2018: 9883806
My Talking Angela: 9883367
Hill Climb Racing: 8923847
Asphalt 8: Airborne: 8389714
Mobile Legends: Bang Bang: 8219586
PHOTOGRAPHY
Google Photos: 10859051
PicsArt Photo Studio: Collage Maker & Pic Editor: 7594559
PhotoGrid: Video & Pic Collage Maker, Photo Editor: 7529865
Retrica: 6120977
B612 - Beauty & Filter Camera: 5282578
Camera360: Selfie Photo Editor with Funny Sticker: 4865132
Candy Camera - selfie, beauty camera, photo editor: 3368705
YouCam Makeup - Magic Selfie Makeovers: 3337956
BeautyPlus - Easy Photo Editor & Selfie Camera: 3158151
Cymera Camera- Photo Editor, Filter,Collage,Layout: 2418165
Video Editor Music,Cut,No Crop: 2163282
Photo Editor Pro: 1871421
Keepsafe Photo Vault: Hide Private Photos & Videos: 1656808
YouCam Perfect - Selfie Photo Editor: 1579343
Photo Lab Picture Editor: face effects, art frames: 1536512
The App Store and Google Play store have a crucial difference in terms of their distributions of free app genres: the App Store is much more saturated by games to the point where it is hard to make other genre recommendations for small developers.
While developing games is a fair choice for the Google Play store, other genres like photography have high potential for developers.
Wild speculation: there's a variety of selfie apps and collage makers in the more popular photography apps on the Play Store -- perhaps a selfie-collage maker could gain traction?
There are still unanswered questions and possible improvements after this initial analysis. Here are some threads: