import pandas as pd
ios = pd.read_csv("appleAppData.csv")
android = pd.read_csv("Google-Playstore.csv")
The code below lists the column names of the Google Play Store apps dataset, which is referred to as "android" in the code. The first rows of the dataset follow as a sample. The relevant columns for this project are "Category", "Rating", "Installs", "Price", "Released", and "Content Rating". The "Content Rating" column indicates the audience an app is suitable for; a value of "Everyone" means that the app has no age restrictions. The "Category" column gives the genre of the app, and we will filter on this column to select all apps with the value "Education". The "Released" column gives the date the app was released; later we will examine both the year and the month from this date. The "Rating" column gives the average user rating for the app, and the "Installs" column gives the number of user installs as a lower bound followed by a plus sign, such as 100+ or 50+. The output also shows the number of rows (apps) in the dataset and the number of columns, that is, the number of attributes recorded for each app.
android.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2312944 entries, 0 to 2312943
Data columns (total 24 columns):
# | Column | Dtype |
---|---|---|
0 | App Name | object |
1 | App Id | object |
2 | Category | object |
3 | Rating | float64 |
4 | Rating Count | float64 |
5 | Installs | object |
6 | Minimum Installs | float64 |
7 | Maximum Installs | int64 |
8 | Free | bool |
9 | Price | float64 |
10 | Currency | object |
11 | Size | object |
12 | Minimum Android | object |
13 | Developer Id | object |
14 | Developer Website | object |
15 | Developer Email | object |
16 | Released | object |
17 | Last Updated | object |
18 | Content Rating | object |
19 | Privacy Policy | object |
20 | Ad Supported | bool |
21 | In App Purchases | bool |
22 | Editors Choice | bool |
23 | Scraped Time | object |
dtypes: bool(4), float64(4), int64(1), object(15)
memory usage: 361.8+ MB
android.head(3)
App Name | App Id | Category | Rating | Rating Count | Installs | Minimum Installs | Maximum Installs | Free | Price | ... | Developer Website | Developer Email | Released | Last Updated | Content Rating | Privacy Policy | Ad Supported | In App Purchases | Editors Choice | Scraped Time | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Gakondo | com.ishakwe.gakondo | Adventure | 0.0 | 0.0 | 10+ | 10.0 | 15 | True | 0.0 | ... | https://beniyizibyose.tk/#/ | jean21101999@gmail.com | Feb 26, 2020 | Feb 26, 2020 | Everyone | https://beniyizibyose.tk/projects/ | False | False | False | 2021-06-15 20:19:35 |
1 | Ampere Battery Info | com.webserveis.batteryinfo | Tools | 4.4 | 64.0 | 5,000+ | 5000.0 | 7662 | True | 0.0 | ... | https://webserveis.netlify.app/ | webserveis@gmail.com | May 21, 2020 | May 06, 2021 | Everyone | https://dev4phones.wordpress.com/licencia-de-uso/ | True | False | False | 2021-06-15 20:19:35 |
2 | Vibook | com.doantiepvien.crm | Productivity | 0.0 | 0.0 | 50+ | 50.0 | 58 | True | 0.0 | ... | NaN | vnacrewit@gmail.com | Aug 9, 2019 | Aug 19, 2019 | Everyone | https://www.vietnamairlines.com/vn/en/terms-an... | False | False | False | 2021-06-15 20:19:35 |
3 rows × 24 columns
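The "Installs" values shown above are strings such as "10+" or "5,000+", so they cannot be compared numerically as they stand. Below is a minimal sketch of one way to convert them, assuming the strings always consist of digits, optional thousands separators, and a trailing plus sign. The helper name `parse_installs` and the sample values are illustrative and not part of the original notebook.

```python
import pandas as pd

def parse_installs(installs: pd.Series) -> pd.Series:
    """Convert strings such as '5,000+' into numbers (the lower bound of installs)."""
    cleaned = (
        installs.str.replace(",", "", regex=False)  # drop thousands separators
                .str.replace("+", "", regex=False)  # drop the trailing plus sign
    )
    # errors="coerce" turns anything unparseable, including missing values, into NaN
    return pd.to_numeric(cleaned, errors="coerce")

# Made-up sample in the same format as the "Installs" column
sample = pd.Series(["10+", "5,000+", "50+", None])
parsed = parse_installs(sample)
print(parsed.tolist())
```

In the sample rows above, the "Minimum Installs" column appears to hold the same lower bound already as a number, so a conversion like this may only be needed when that column is unavailable.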
The code below lists the column names of the Apple App Store dataset, which is referred to as "ios" in the code. The first rows of the dataset follow as a sample. The relevant columns for this project are "Primary_Genre", "Content_Rating", "Released", "Price", and "Average_User_Rating". The "Primary_Genre" column gives the category of each app, and the apps in the Education category are the ones relevant for this project. The "Released" column gives the date the app was released. The "Price" column gives the price to buy the app, and the "Content_Rating" column tells the user which audience the app is suitable for. Content ratings for Apple apps differ from those for Google apps. For Apple apps, a content rating of 4+ means that there is no objectionable material and the app is suitable for everyone. Apple apps have other content ratings such as 9+ (unsuitable for children under age 9), 12+ (unsuitable for children under age 12), and 17+ (unsuitable for children under age 17). We will filter the dataset to select the apps rated 4+ in order to be consistent with Google's "Everyone" content rating.
ios.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1230376 entries, 0 to 1230375
Data columns (total 21 columns):
# | Column | Non-Null Count | Dtype |
---|---|---|---|
0 | App_Id | 1230376 non-null | object |
1 | App_Name | 1230375 non-null | object |
2 | AppStore_Url | 1230376 non-null | object |
3 | Primary_Genre | 1230376 non-null | object |
4 | Content_Rating | 1230376 non-null | object |
5 | Size_Bytes | 1230152 non-null | float64 |
6 | Required_IOS_Version | 1230376 non-null | object |
7 | Released | 1230373 non-null | object |
8 | Updated | 1230376 non-null | object |
9 | Version | 1230376 non-null | object |
10 | Price | 1229886 non-null | float64 |
11 | Currency | 1230376 non-null | object |
12 | Free | 1230376 non-null | bool |
13 | DeveloperId | 1230376 non-null | int64 |
14 | Developer | 1230376 non-null | object |
15 | Developer_Url | 1229267 non-null | object |
16 | Developer_Website | 586388 non-null | object |
17 | Average_User_Rating | 1230376 non-null | float64 |
18 | Reviews | 1230376 non-null | int64 |
19 | Current_Version_Score | 1230376 non-null | float64 |
20 | Current_Version_Reviews | 1230376 non-null | int64 |
dtypes: bool(1), float64(4), int64(3), object(13)
memory usage: 188.9+ MB
ios.head(3)
App_Id | App_Name | AppStore_Url | Primary_Genre | Content_Rating | Size_Bytes | Required_IOS_Version | Released | Updated | Version | ... | Currency | Free | DeveloperId | Developer | Developer_Url | Developer_Website | Average_User_Rating | Reviews | Current_Version_Score | Current_Version_Reviews | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | com.hkbu.arc.apaper | A+ Paper Guide | https://apps.apple.com/us/app/a-paper-guide/id... | Education | 4+ | 21993472.0 | 8.0 | 2017-09-28T03:02:41Z | 2018-12-21T21:30:36Z | 1.1.2 | ... | USD | True | 1375410542 | HKBU ARC | https://apps.apple.com/us/developer/hkbu-arc/i... | NaN | 0.0 | 0 | 0.0 | 0 |
1 | com.dmitriev.abooks | A-Books | https://apps.apple.com/us/app/a-books/id103157... | Book | 4+ | 13135872.0 | 10.0 | 2015-08-31T19:31:32Z | 2019-07-23T20:31:09Z | 1.3 | ... | USD | True | 1031572001 | Roman Dmitriev | https://apps.apple.com/us/developer/roman-dmit... | NaN | 5.0 | 1 | 5.0 | 1 |
2 | no.terp.abooks | A-books | https://apps.apple.com/us/app/a-books/id145702... | Book | 4+ | 21943296.0 | 9.0 | 2021-04-14T07:00:00Z | 2021-05-30T21:08:54Z | 1.3.1 | ... | USD | True | 1457024163 | Terp AS | https://apps.apple.com/us/developer/terp-as/id... | NaN | 0.0 | 0 | 0.0 | 0 |
3 rows × 21 columns
## drop some columns from the original database and make sure that it works
ios.drop(columns = ["App_Id"], axis=1, inplace = True)
ios.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1230376 entries, 0 to 1230375
Data columns (total 20 columns):
# | Column | Non-Null Count | Dtype |
---|---|---|---|
0 | App_Name | 1230375 non-null | object |
1 | AppStore_Url | 1230376 non-null | object |
2 | Primary_Genre | 1230376 non-null | object |
3 | Content_Rating | 1230376 non-null | object |
4 | Size_Bytes | 1230152 non-null | float64 |
5 | Required_IOS_Version | 1230376 non-null | object |
6 | Released | 1230373 non-null | object |
7 | Updated | 1230376 non-null | object |
8 | Version | 1230376 non-null | object |
9 | Price | 1229886 non-null | float64 |
10 | Currency | 1230376 non-null | object |
11 | Free | 1230376 non-null | bool |
12 | DeveloperId | 1230376 non-null | int64 |
13 | Developer | 1230376 non-null | object |
14 | Developer_Url | 1229267 non-null | object |
15 | Developer_Website | 586388 non-null | object |
16 | Average_User_Rating | 1230376 non-null | float64 |
17 | Reviews | 1230376 non-null | int64 |
18 | Current_Version_Score | 1230376 non-null | float64 |
19 | Current_Version_Reviews | 1230376 non-null | int64 |
dtypes: bool(1), float64(4), int64(3), object(12)
memory usage: 179.5+ MB
## dropping more columns to make the data more manageable
ios.drop(columns = ["Size_Bytes", "Required_IOS_Version", "DeveloperId", "Developer", "Developer_Url"], axis=1, inplace = True)
ios.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1230376 entries, 0 to 1230375
Data columns (total 15 columns):
# | Column | Non-Null Count | Dtype |
---|---|---|---|
0 | App_Name | 1230375 non-null | object |
1 | AppStore_Url | 1230376 non-null | object |
2 | Primary_Genre | 1230376 non-null | object |
3 | Content_Rating | 1230376 non-null | object |
4 | Released | 1230373 non-null | object |
5 | Updated | 1230376 non-null | object |
6 | Version | 1230376 non-null | object |
7 | Price | 1229886 non-null | float64 |
8 | Currency | 1230376 non-null | object |
9 | Free | 1230376 non-null | bool |
10 | Developer_Website | 586388 non-null | object |
11 | Average_User_Rating | 1230376 non-null | float64 |
12 | Reviews | 1230376 non-null | int64 |
13 | Current_Version_Score | 1230376 non-null | float64 |
14 | Current_Version_Reviews | 1230376 non-null | int64 |
dtypes: bool(1), float64(3), int64(2), object(9)
memory usage: 132.6+ MB
This deleted some columns from the Apple App Store dataset so that it is easier to work with and takes less time to run. The data also takes up less memory: the reported usage drops from 188.9+ MB to 132.6+ MB.
The code below builds a boolean mask over each dataset that keeps only the apps in the "Education" category and, among those, only the apps whose content rating is suitable for everyone. The first code snippet works on the Google Play Store dataset. This reduces the number of apps from 2.3 million to a little over 230 thousand. The same filtering reduces the Apple dataset from 1.2 million to a little over 100 thousand.
android_final = android[(android["Category"]=='Education')&(android["Content Rating"]=='Everyone')]
print("Number of rows in educational Google apps suitable for everyone:", len(android_final))
Number of rows in educational Google apps suitable for everyone: 232180
android_final.head(4)
App Name | App Id | Category | Rating | Rating Count | Installs | Minimum Installs | Maximum Installs | Free | Price | ... | Developer Website | Developer Email | Released | Last Updated | Content Rating | Privacy Policy | Ad Supported | In App Purchases | Editors Choice | Scraped Time | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
37 | Calculus Tutorial 1: Introduction | com.RaySemiSoft.CalculusT1 | Education | 0.0 | 0.0 | 100+ | 100.0 | 277 | True | 0.0 | ... | NaN | raysemisoft@gmail.com | Jun 18, 2020 | Jun 01, 2021 | Everyone | NaN | False | False | False | 2021-06-15 20:19:37 |
67 | RACE ACADEMY | co.davos.snqkw | Education | 0.0 | 0.0 | 100+ | 100.0 | 186 | True | 0.0 | ... | NaN | support@classplus.co | Jan 9, 2021 | May 18, 2021 | Everyone | https://bit.ly/2YDTip0 | False | False | False | 2021-06-15 20:19:39 |
72 | Triple Point Academy | co.varys.sinbd | Education | 5.0 | 5.0 | 10+ | 10.0 | 18 | True | 0.0 | ... | NaN | support@classplus.co | Oct 15, 2020 | Jun 13, 2021 | Everyone | https://bit.ly/33pSGFX | False | False | False | 2021-06-15 20:19:39 |
96 | Духовно-нравственная культура (ДНК) | appinventor.ai_moscluster_com.DNK | Education | 0.0 | 0.0 | 100+ | 100.0 | 103 | True | 0.0 | ... | http://www.moscluster.com | 1@moscluster.com | Jul 5, 2017 | Jul 05, 2017 | Everyone | https://www.moscluster.com/?page_id=463 | False | False | False | 2021-06-15 20:19:41 |
4 rows × 24 columns
ios_final = ios[(ios["Primary_Genre"]=='Education') & (ios["Content_Rating"]=='4+')]
print("Number of rows in educational Apple apps suitable for everyone:", len(ios_final))
Number of rows in educational Apple apps suitable for everyone: 106049
ios_final.head(3)
App_Name | AppStore_Url | Primary_Genre | Content_Rating | Released | Updated | Version | Price | Currency | Free | Developer_Website | Average_User_Rating | Reviews | Current_Version_Score | Current_Version_Reviews | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | A+ Paper Guide | https://apps.apple.com/us/app/a-paper-guide/id... | Education | 4+ | 2017-09-28T03:02:41Z | 2018-12-21T21:30:36Z | 1.1.2 | 0.0 | USD | True | NaN | 0.0 | 0 | 0.0 | 0 |
28 | AAB Mobile | https://apps.apple.com/us/app/aab-mobile/id147... | Education | 4+ | 2019-08-29T07:00:00Z | 2019-12-28T20:58:19Z | 1.3 | 0.0 | USD | True | NaN | 5.0 | 1 | 5.0 | 1 |
31 | AAJ Year Book | https://apps.apple.com/us/app/aaj-year-book/id... | Education | 4+ | 2015-11-09T23:24:28Z | 2015-12-08T00:22:26Z | 1.01 | 0.0 | USD | True | http://aaj.edu.jo | 1.0 | 1 | 1.0 | 1 |
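The two filters above rely on boolean masks, not on an explicit loop. The sketch below shows the mechanics on a handful of made-up rows; the frame `apps` and the mask names are illustrative only.

```python
import pandas as pd

# Made-up rows that mimic the two columns used by the Google filter
apps = pd.DataFrame({
    "Category": ["Education", "Tools", "Education", "Education"],
    "Content Rating": ["Everyone", "Everyone", "Teen", "Everyone"],
})

# Each comparison yields a boolean Series with one True/False per row
is_education = apps["Category"] == "Education"
is_everyone = apps["Content Rating"] == "Everyone"

# & combines the masks element-wise. When the comparisons are written inline,
# each one needs its own parentheses because & binds more tightly than ==.
filtered = apps[is_education & is_everyone]
print(len(filtered))
```

Indexing with the combined mask keeps the original row labels, which is why the filtered tables above start at index values such as 37 and 28 instead of being renumbered from zero.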
The nested pie chart shows that apps in the educational category that are rated as suitable for everyone make up roughly 10% or less of each dataset, which together cover the years 2008-2021. These apps account for about 10% of all Google Play Store apps (232,180 of 2,312,944) and 8.62% of all Apple App Store apps (106,049 of 1,230,376). We will use these final filtered datasets for the rest of the data exploration and visualization in this project.
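As a quick check on these shares, the arithmetic can be reproduced from the row counts printed earlier. This is only a sketch; the variable names are chosen here for illustration.

```python
# Row counts reported by the notebook outputs above
android_total, android_edu = 2312944, 232180
ios_total, ios_edu = 1230376, 106049

# Share of educational, suitable-for-everyone apps in each store, in percent
android_pct = 100 * android_edu / android_total
ios_pct = 100 * ios_edu / ios_total

# The complements are the "other apps" counts that go into the pie chart
android_other = android_total - android_edu
ios_other = ios_total - ios_edu

print(f"Google: {android_pct:.2f}% educational, {android_other} other apps")
print(f"Apple:  {ios_pct:.2f}% educational, {ios_other} other apps")
```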
import seaborn as sns
import matplotlib.pyplot as plt
size = 0.3
facecolor = '#eaeaf2'
font_color = '#525252'
labels = ['Other Apps', 'Educational Apps']
vals = [106049, 1124327]
group_sum = [232180, 2080764]
group_name = ["Google",""]
subgroup_name = ["","Apple"]
names=["Edu. Apps", "Other Apps"]
fig, ax = plt.subplots(figsize=(15,10), facecolor=facecolor)
outer_colors = ["#FF0066", "#FFFFD1"]
inner_colors = ["#FF0066", "#FFFFD1"]
ax.pie(group_sum, radius = 1, colors=outer_colors, labels=group_name, textprops={'color':font_color}, wedgeprops = dict(width=size, edgecolor='black'))
ax.pie(vals, radius = 1-size, colors=inner_colors, labels=subgroup_name, wedgeprops=dict(width=size, edgecolor='black'))
ax.set_title("Google vs. Apple Educational Apps as % Total", fontsize=18, pad=15, color=font_color)
plt.legend(names, loc='lower left')
plt.show()
The following code reveals that while Apple app prices all end in 99 cents, Google app prices are more varied. I will select a new dataset of prices ending in 99 cents, taking prices between 0.99 and 19.99 in one-dollar steps. I then create a bar graph counting how many apps are offered at each price. The first bar graph shows the Google Play Store dataset: the most common prices are 4.99 and under, with 99 cents the single most common price. The second bar graph shows the Apple App Store dataset, which follows a similar trend, although it has more apps offered at 1.99 and 2.99 than the Google Play Store. This may simply reflect the larger number of paid Apple apps (15,951 versus 6,400 for Google). Both bar graphs reveal that above 4.99, the most commonly found prices are 6.99 and 9.99.
The following code snippet reveals all the unique price points in the datasets. Note that the Google Play Store dataset has data dating back to 2010 while the Apple app dataset has data dating back to 2008.
##Google apps prices and their frequency
android_final["Price"].value_counts()
Price | Count |
---|---|
0.000000 | 225780 |
0.990000 | 1265 |
1.990000 | 735 |
2.990000 | 643 |
1.490000 | 489 |
... | ... |
0.590378 | 1 |
12.000000 | 1 |
4.880000 | 1 |
10.970000 | 1 |
18.903596 | 1 |
Name: Price, Length: 321, dtype: int64
This reveals that Google apps in the educational category are mostly free. We don't want to look at the free apps for now, so we will isolate just those apps that have a price above zero. We are left with 6,400 apps.
## only get Google apps priced above zero
android_final=android_final.loc[android_final["Price"] > 0]
android_final["Price"].value_counts()
Price | Count |
---|---|
0.990000 | 1265 |
1.990000 | 735 |
2.990000 | 643 |
1.490000 | 489 |
3.990000 | 370 |
... | ... |
0.590378 | 1 |
12.000000 | 1 |
4.880000 | 1 |
10.970000 | 1 |
18.903596 | 1 |
Name: Price, Length: 320, dtype: int64
print("The number of rows left for Google apps:", len(android_final))
The number of rows left for Google apps: 6400
Let's now compare this with the number of educational apps in Apple that have a price.
#Apple apps' prices and their frequency
ios_final["Price"].value_counts()
Price | Count |
---|---|
0.00 | 90092 |
0.99 | 3692 |
1.99 | 3370 |
2.99 | 2607 |
3.99 | 1508 |
... | ... |
41.99 | 1 |
349.99 | 1 |
209.99 | 1 |
38.99 | 1 |
46.99 | 1 |
Name: Price, Length: 75, dtype: int64
The majority of educational Apple apps are also offered free.
## only get Apple apps priced above zero
ios_final=ios_final.loc[ios_final["Price"] > 0]
ios_final["Price"].value_counts()
Price | Count |
---|---|
0.99 | 3692 |
1.99 | 3370 |
2.99 | 2607 |
3.99 | 1508 |
4.99 | 1401 |
... | ... |
109.99 | 1 |
399.99 | 1 |
349.99 | 1 |
124.99 | 1 |
46.99 | 1 |
Name: Price, Length: 74, dtype: int64
print("The number of rows left for Apple apps:", len(ios_final))
The number of rows left for Apple apps: 15951
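The Apple counts above do not quite add up: 106,049 educational apps minus 90,092 free ones leaves 15,957, yet only 15,951 rows remain after the filter. This is consistent with a few rows having a missing price, since `value_counts()` skips missing values by default and a comparison such as `NaN > 0` evaluates to False. The sketch below illustrates both behaviours on made-up prices; the notebook itself does not check for missing prices, so this explanation is an inference.

```python
import numpy as np
import pandas as pd

# Made-up prices, including one missing value
prices = pd.Series([0.0, 0.99, 1.99, np.nan, 0.0])

counts = prices.value_counts()   # NaN is excluded from the tally by default
paid = prices.loc[prices > 0]    # NaN > 0 is False, so the NaN row drops out as well

# Counts printed in the notebook for the Apple educational subset
ios_edu, ios_free, ios_paid = 106049, 90092, 15951
unaccounted_rows = ios_edu - ios_free - ios_paid
print(counts.sum(), len(paid), unaccounted_rows)
```

For the Google subset the same subtraction leaves nothing over (232,180 minus 225,780 equals 6,400).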
##Create a bar graph to visualize Google dataset prices
import matplotlib.pyplot as plt
import seaborn as sns
plt.figure(figsize=(16,8))
popular_prices = [0.99, 1.99, 2.99, 3.99, 4.99, 5.99, 6.99, 7.99, 8.99, 9.99,11.99, 12.99, 13.99, 14.99, 15.99, 16.99, 17.99, 18.99, 19.99]
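## .copy() makes an independent DataFrame so that new columns can be added later without a SettingWithCopyWarning
android_prices = android_final[android_final["Price"].isin(popular_prices)].copy()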
sns.set_style("whitegrid")
sns.countplot(data=android_prices, x="Price", orient="h", palette="pastel")
plt.xticks(rotation=45)
plt.title("Google Play Store's Educational Apps Priced Under $20", fontsize=14)
plt.show()
## Create a bar graph to visualize Apple Apps dataset prices
plt.figure(figsize=(16,8))
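## .copy() makes an independent DataFrame so that new columns can be added later without a SettingWithCopyWarning
ios_prices = ios_final[ios_final["Price"].isin(popular_prices)].copy()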
sns.countplot(data=ios_prices, x="Price", orient="h", palette="pastel")
plt.title("Apple App Store's Educational Apps Priced Under $20", fontsize=14)
plt.xticks(rotation=45)
plt.show()
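The price grid used for both bar charts was typed by hand. A minimal sketch of generating it instead is given below, assuming the intent is every price from 0.99 to 19.99 in one-dollar steps; `price_grid` is a name introduced here for illustration. Building each value from whole cents keeps the floats identical to the literals in the hand-written list.

```python
# Build 0.99, 1.99, ..., 19.99 from integer cents so that no value is skipped by hand
price_grid = [(dollars * 100 + 99) / 100 for dollars in range(20)]

# The hand-written list from the notebook, copied here for comparison
popular_prices = [0.99, 1.99, 2.99, 3.99, 4.99, 5.99, 6.99, 7.99, 8.99, 9.99,
                  11.99, 12.99, 13.99, 14.99, 15.99, 16.99, 17.99, 18.99, 19.99]

# Any grid value that the hand-written list leaves out
missing = [p for p in price_grid if p not in popular_prices]
print(missing)
```

The comparison shows that the hand-written list has no entry for 10.99, so apps at that price do not appear in the charts above or in the row counts that follow.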
To look at price changes over time, I use the "Released" column of both datasets, which gives the date each app was released. I look at the prices of apps by release date over the years 2008-2021. I also look at the change across months within each year, to search for any hidden patterns. First, I need to extract the month and year from the "Released" column. This takes some work for the Google Play Store dataset, because the date is stored as a string such as "Feb 26, 2020". I first create a regular-expression pattern that matches the year, then one that matches the month, and assign the results to new columns titled "Year" and "Month", respectively. I then do the same for the Apple App Store dataset, which proves easier because its dates are stored as ISO-style timestamps.
##Extract month and year and assign new columns with those values for easy access
pattern = r"([2][0-9]{3})"
years = android_prices["Released"].str.extract(pattern)
android_prices["Year"] = years
##Check the Google Play Store Set for months and that they are extracted
monthpattern = r"([A-Z][a-z]{2})"
month = android_prices["Released"].str.extract(monthpattern)
android_prices["Month"] = month
android_prices["Month"] = pd.Categorical(android_prices["Month"], ['Jan','Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'])
android_prices.sort_values("Month")
App Name | App Id | Category | Rating | Rating Count | Installs | Minimum Installs | Maximum Installs | Free | Price | ... | Released | Last Updated | Content Rating | Privacy Policy | Ad Supported | In App Purchases | Editors Choice | Scraped Time | Year | Month | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1275326 | Learn French | com.metalanguage.learnfrench | Education | 4.8 | 10.0 | 100+ | 100.0 | 392 | False | 4.99 | ... | Jan 21, 2016 | Dec 19, 2019 | Everyone | http://metalanguagepro.com/privacy-policy/ | False | False | False | 2021-06-15 21:49:03 | 2016 | Jan |
164134 | [PRO] DPE RJ TECNICO MEDIO 2019 | br.com.concursoprepara.defensoriapublicadoesta... | Education | 0.0 | 0.0 | 10+ | 10.0 | 19 | False | 3.99 | ... | Jan 22, 2019 | Jan 22, 2019 | Everyone | NaN | False | False | False | 2021-06-15 23:02:43 | 2019 | Jan |
1947303 | Praxis II Business Education Exam Prep Flashcards | com.smart.serious.software.app.learn.flashcard... | Education | 0.0 | 0.0 | 10+ | 10.0 | 13 | False | 1.99 | ... | Jan 2, 2019 | Jan 02, 2019 | Everyone | https://smart-apps.flycricket.io/privacy.html | False | False | False | 2021-06-16 07:39:52 | 2019 | Jan |
922263 | おかねかぞえ | nhiraiwa.kids.money | Education | 0.0 | 0.0 | 100+ | 100.0 | 122 | False | 0.99 | ... | Jan 12, 2013 | Jan 30, 2016 | Everyone | NaN | False | False | False | 2021-06-16 11:06:16 | 2013 | Jan |
921974 | 21 Courageous Prayers | org.jeffmikels.courageousprayers | Education | 0.0 | 0.0 | 10+ | 10.0 | 40 | False | 0.99 | ... | Jan 3, 2019 | Jan 03, 2019 | Everyone | NaN | False | False | False | 2021-06-16 11:06:00 | 2019 | Jan |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
1725390 | Mnemocon Cards - обучение английскому по карто... | ru.mnemocon.cards.mnemoconcards | Education | 4.7 | 17.0 | 100+ | 100.0 | 316 | False | 0.99 | ... | NaN | Jun 02, 2018 | Everyone | https://sites.google.com/view/mnemocon | False | False | False | 2021-06-16 04:31:43 | NaN | NaN |
1907502 | Hiragana/Katakana Drill Pro | org.muth.android.kana | Education | 4.3 | 70.0 | 1,000+ | 1000.0 | 1268 | False | 2.99 | ... | NaN | Nov 21, 2015 | Everyone | NaN | False | False | False | 2021-06-16 07:04:55 | NaN | NaN |
1953197 | German Verbs Pro | org.muth.android.conjugator_pro_de | Education | 4.5 | 246.0 | 1,000+ | 1000.0 | 4582 | False | 4.99 | ... | NaN | Jul 17, 2019 | Everyone | NaN | False | False | False | 2021-06-16 07:45:03 | NaN | NaN |
2050133 | AnyMemo Pro: For Donation | org.liberty.android.fantastischmemopro | Education | 4.5 | 151.0 | 1,000+ | 1000.0 | 1982 | False | 1.99 | ... | NaN | Aug 08, 2020 | Everyone | https://anymemo.org/privacy-policy-view | False | False | False | 2021-06-16 09:09:59 | NaN | NaN |
2262468 | CompTIA Server+ Exam Prep | com.dblpartners.comptia_server | Education | 0.0 | 0.0 | 100+ | 100.0 | 223 | False | 4.99 | ... | NaN | Oct 28, 2019 | Everyone | http://dynamicpath.com/privacy | False | False | False | 2021-06-16 12:14:53 | NaN | NaN |
4148 rows × 26 columns
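As an alternative to the two regular expressions above, the release strings could be parsed as dates in one step. The sketch below is not the notebook's method; it assumes every non-missing value follows the "Feb 26, 2020" pattern seen in the sample rows and that month names are in English (both `%b` parsing and `strftime("%b")` depend on the locale).

```python
import pandas as pd

# Made-up sample in the same format as the Google "Released" column
released = pd.Series(["Feb 26, 2020", "Aug 9, 2019", None])

# %b = abbreviated month name, %d = day of the month, %Y = four-digit year.
# errors="coerce" turns missing or malformed values into NaT instead of raising.
dates = pd.to_datetime(released, format="%b %d, %Y", errors="coerce")

month_order = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
               'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

year = dates.dt.year  # numeric already; NaN where the date is missing
month = pd.Categorical(dates.dt.strftime("%b"), categories=month_order, ordered=True)
print(year.tolist(), list(month))
```

Because the year comes back as a number, the later conversion from string to float would not be needed with this approach.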
##The years are stored as strings so I have to convert to float
android_prices['Year'] = android_prices['Year'].astype(float)
##Select years before 2015
android_priceyear = android_prices.loc[android_prices["Year"] < 2015]
The graphs below are examined for patterns. The line graphs for both the Google Play Store and the Apple App Store reveal one major pattern: prices peak once or twice per year. Generally, higher prices appear in February-March and/or September-October, while prices are low in the summer and in December-January. One plausible explanation is that teachers look for educational apps during the back-to-school season in the fall and again right before the spring semester, in order to finish the academic year strong with fresh inspiration. Notice that for the Google Play Store's educational apps released in 2010-2014, many of the price peaks are between 6 and 8 dollars, which is higher than in the years that follow.
plt.figure(figsize=(16,8))
sns.set_style("whitegrid")
sns.lineplot(data=android_priceyear, x="Month", y="Price", hue="Year", palette='bright')
plt.xticks(rotation=45)
plt.title("Google Play Store's Educational Apps' Price Trend over Yrs. 2010-2014", fontsize=14)
plt.show()
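When several rows share the same x value, seaborn's `lineplot` by default draws the mean of y and shades a confidence interval around it. The "price peaks" in these graphs are therefore peaks in the mean price of the apps released in a given month. The sketch below tabulates the same kind of summary on made-up rows; the frame `toy` is illustrative only.

```python
import pandas as pd

# Made-up rows with the three columns used by the line plots
toy = pd.DataFrame({
    "Year":  [2013, 2013, 2013, 2014, 2014],
    "Month": ["Feb", "Feb", "Jul", "Feb", "Jul"],
    "Price": [0.99, 6.99, 1.99, 4.99, 0.99],
})

# Mean price per (year, month), plus the number of apps behind each mean
summary = toy.groupby(["Year", "Month"])["Price"].agg(["mean", "count"])
print(summary)
```

Keeping the count next to the mean helps when reading the graphs: a month with only a handful of releases can produce a sharp peak from a single expensive app.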
android_secpriceyearseg = android_prices.loc[(android_prices["Year"] < 2020) &(android_prices["Year"] > 2014)]
len(android_secpriceyearseg)
2465
plt.figure(figsize=(16,8))
sns.set_style("whitegrid")
sns.lineplot(data=android_secpriceyearseg, x="Month", y="Price", hue="Year", palette='bright')
plt.xticks(rotation=45)
plt.title("Google Play Store's Educational Apps' Price Trend over Yrs. 2015-2019", fontsize=14)
plt.show()
android_pandemicyrs = android_prices.loc[android_prices["Year"] >= 2019]
len(android_pandemicyrs)
1143
The Covid years on the Google Play Store show a sharp peak in prices in March 2021. By contrast, prices were fairly stable throughout 2020.
plt.figure(figsize=(16,8))
sns.set_style("whitegrid")
sns.lineplot(data=android_pandemicyrs, x="Month", y="Price", hue="Year", palette='bright')
plt.xticks(rotation=45)
plt.title("Google Play Store's Educational Apps' Price Trend over Covid Pandemic Yrs. 2019-2021", fontsize=14)
plt.show()
Below I start to examine the Apple App Store dataset. The Apple App Store dataset is a little different from the Google Play Store dataset, in that the data started two years earlier, in 2008. I isolate these early years in the first graph. We can see that the prices started out high in 2008 and immediately dropped in early 2009. By 2010, the prices stabilized.
## extract the year and month when app was released and add a column for month and year to dataframe
from datetime import datetime
ios_prices["Month"] =pd.DatetimeIndex(ios_prices['Released']).month
ios_prices["Year"]=pd.DatetimeIndex(ios_prices['Released']).year
## get Apple apps released from 2008 through 2010
ios_priceyearseg1 = ios_prices.loc[ios_prices["Year"] < 2011]
len(ios_priceyearseg1)
714
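The Apple release dates are ISO-style timestamps, so year and month come out as numbers, whereas the Google plots use month abbreviations on the x-axis. Below is a sketch of one way to put both datasets on the same month labels, using the three timestamps from the sample rows shown earlier. The frame name `ios_dates` is introduced here for illustration.

```python
import pandas as pd

# Timestamps taken from the Apple sample rows shown earlier
released = pd.Series([
    "2017-09-28T03:02:41Z",
    "2015-08-31T19:31:32Z",
    "2021-04-14T07:00:00Z",
])

# utc=True keeps the trailing "Z" (UTC) information explicit
dates = pd.to_datetime(released, utc=True)

month_order = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
               'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec']

ios_dates = pd.DataFrame({
    "Year": dates.dt.year,
    # Map month numbers 1..12 onto the same ordered labels used for the Google data
    "Month": pd.Categorical([month_order[m - 1] for m in dates.dt.month],
                            categories=month_order, ordered=True),
})
print(ios_dates)
```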
sns.set_style("whitegrid")
## relplot is a figure-level function that creates its own figure, so a separate plt.figure() call would only produce an empty extra figure
sns.relplot(data=ios_priceyearseg1, x="Month", y="Price", hue="Year", height=7, aspect=2, kind="line", palette='bright')
plt.xticks(rotation=45)
plt.title("Apple App Store's Educational Apps' Price Trend over Yrs. 2008-2010", fontsize=14)
plt.show()
The Apple App Store prices for 2010-2014 show that prices rose after 2010. The trend began in 2012, and by 2014 prices were generally higher than in the years prior. Still, the Google Play Store's price peaks were higher than those in the Apple App Store over the same time period: most of the price peaks in the Google Play Store were between 6 and 8 dollars, whereas in the Apple App Store they were between 4 and 5 dollars.
ios_secpriceyearseg = ios_prices.loc[(ios_prices["Year"] < 2015) &(ios_prices["Year"] > 2009)]
sns.set_style("whitegrid")
## relplot creates its own figure, so no plt.figure() call is needed
sns.relplot(data=ios_secpriceyearseg, x="Month", y="Price", hue="Year", height=6, aspect=2, kind="line", palette='bright')
plt.xticks(rotation=45)
plt.title("Apple App Store's Educational Apps' Price Trend over Yrs. 2010-2014", fontsize=14)
plt.show()
The graph below shows that during the years 2015-2019, Apple price peaks were still between 4 and 5 dollars. During the same period, Google price peaks were mostly between 5 and 6 dollars, down from the 6 to 8 dollars seen during 2010-2014. Thus, the Apple App Store prices prove to be more stable over time than those of the Google Play Store. Still, we can see some very high price peaks in the Apple App Store, such as in October 2019, when prices swung towards 7 dollars.
ios_thirdpriceyearseg = ios_prices.loc[(ios_prices["Year"] < 2020) &(ios_prices["Year"] > 2014)]
sns.set_style("whitegrid")
## relplot creates its own figure, so no plt.figure() call is needed
sns.relplot(data=ios_thirdpriceyearseg, x="Month", y="Price", hue="Year", height=6, aspect=2, kind="line", palette='bright')
plt.title("Apple App Store's Educational Apps' Price Trend over Yrs. 2015-2019", fontsize=14)
plt.show()
ios_pandemicyrs = ios_prices.loc[ios_prices["Year"] >= 2019]
len(ios_pandemicyrs)
3996
The graph below shows that Apple App Store prices were very volatile during the pandemic years, more so than Google Play Store prices. The peaks were in October 2020 and February 2021.
plt.figure(figsize=(16,8))
sns.set_style("whitegrid")
sns.lineplot(data=ios_pandemicyrs, x="Month", y="Price", hue="Year", palette='bright')
plt.title("Apple App Store's Educational Apps' Price Trend over Yrs. 2019-2021", fontsize=14)
plt.show()
Prices across the Apple and Google platforms show some common overall trends: the most common prices fall between 99 cents and 2.99, and both platforms exhibit price peaks during February-March and September-October. However, price peaks were generally lower on the Apple App Store (4 to 5 dollars) than on the Google Play Store (5 to 8 dollars). Apple App Store pricing, despite being more stable long-term, exhibited higher peak pricing during the Covid pandemic. In both datasets, prices for educational apps generally fall between 1 and 8 dollars.
This section looks at the trends in average user ratings. The user ratings analysis is conducted over the whole dataset of educational apps, both free and paid apps. It looks like the average user ratings for the Google Play Store apps have the same problem as the pricing: the majority are zero. I will select a new dataset of just the ratings above zero and draw the bar chart again to see the trends in the rest of the data more clearly.
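One caveat when running the notebook from top to bottom: `android_final` and `ios_final` were reassigned earlier to the subsets priced above zero, so the rating plots would then see paid apps only. Below is a minimal sketch of one way to keep both views available, using made-up rows and the hypothetical names `edu_all`, `edu_paid`, and `rated_all`.

```python
import pandas as pd

# Made-up educational apps: two free, two paid, two of them unrated (rating 0.0)
edu_all = pd.DataFrame({
    "Price":  [0.0, 0.99, 0.0, 4.99],
    "Rating": [0.0, 4.5, 4.2, 0.0],
})

# Store the paid subset under its own name so the full set is not overwritten
edu_paid = edu_all.loc[edu_all["Price"] > 0].copy()

# The ratings analysis can then filter the full set, free and paid alike
rated_all = edu_all.loc[edu_all["Rating"] > 0]
print(len(edu_all), len(edu_paid), len(rated_all))
```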
plt.figure(figsize=(16,8))
sns.set_style("whitegrid")
sns.countplot(data=android_final, x="Rating", orient="h", palette="pastel")
plt.title("Google Play Store's Educational App Ratings (Paid and Free Apps)")
plt.xticks(rotation=75, fontsize=12)
plt.show()
The graph reveals that the majority of ratings for the Google Play Store's rated educational apps fall between a rating of 4.2 and 4.7.
android_goodratings = android_final.loc[android_final["Rating"]>0.0]
plt.figure(figsize=(16,8))
sns.set_style("whitegrid")
sns.countplot(data=android_goodratings, x="Rating", orient="h", palette="pastel")
plt.title("Google Play Store's Educational App Ratings (Paid & Free Apps)", fontsize=12)
plt.xticks(rotation=45, fontsize=12)
plt.show()
That is better! Now we can see clearly that, after zero, the most common ratings for educational apps on the Google Play Store fall between 4.2 and 4.7. An educational app should aim for a rating in this range as its benchmark. Below, I try the same for the Apple App Store and run into the same problem: the majority of apps have received no ratings.
Once I remove the zero ratings for the Apple App dataset, the result shows that the Apple Store is unique in that users like to give ratings that are whole numbers. Ratings for whole numbers in the Apple dataset like 3.0, 2.0, and 4.0 are seen more frequently than the Google Play Store's ratings between 4.2 and 4.7.
However, the most common rating in the Apple dataset is 5, revealing a higher chance of receiving a high user rating on the Apple App Store. Once the whole numbers are accounted for, we also see that users are more likely to give ratings between 4 and 5 at the various decimal places. For the Apple App Store, I would benchmark an educational app to aim for a 5.0 rating, given the greater chance of achieving it.
If user ratings on the Apple app store can be advertised on the Google Play Store when introducing the app there, this might be a good strategy.
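The whole-number pattern can be expressed as a single figure per platform. This is a minimal sketch, not part of the original analysis: whole_number_share is a hypothetical helper, and the two sample series are made-up values rather than rows from either dataset.

```python
import pandas as pd

def whole_number_share(ratings: pd.Series) -> float:
    """Fraction of the non-zero ratings that are whole numbers (1.0, 2.0, ...)."""
    rated = ratings[ratings > 0]
    return float((rated % 1 == 0).mean())

# Hypothetical ratings, not taken from either dataset
ios_sample = pd.Series([0.0, 5.0, 5.0, 4.0, 3.0, 4.5, 0.0, 2.0])
android_sample = pd.Series([0.0, 4.2, 4.5, 4.7, 5.0, 0.0, 4.4, 3.9])

print(whole_number_share(ios_sample))
print(whole_number_share(android_sample))
```

On the real data, the same helper could be applied to ios_final["Average_User_Rating"] and android_final["Rating"] to compare the two stores directly.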
# round the ratings to one decimal place
goodrating = ios_final["Average_User_Rating"].round(decimals=1)
ios_final["Average_User_Rating"] = goodrating
#zero is still the most common rating in the Apple App Store
plt.figure(figsize=(16,8))
sns.set_style("whitegrid")
sns.countplot(data=ios_final, x="Average_User_Rating", orient="h", palette="pastel")
plt.title("Apple App Store's Educational App Ratings (Paid & Free Apps)")
plt.xticks(rotation=45, fontsize=12)
plt.show()
#round the Apple app store ratings to one decimal place
goodrating = ios_final["Average_User_Rating"].round(decimals=1)
ios_final["Average_User_Rating"] = goodrating
#filter for only ratings above zero
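# .copy() makes an independent frame so later column assignments do not raise SettingWithCopyWarning
ios_goodratings = ios_final.loc[ios_final["Average_User_Rating"] > 0.0].copy()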
plt.figure(figsize=(16,8))
sns.set_style("whitegrid")
sns.countplot(data=ios_goodratings, x="Average_User_Rating", orient="h", palette="pastel")
plt.title("Apple App Store's Educational App User Ratings (Paid & Free Apps)")
plt.xticks(rotation=45, fontsize=12)
plt.show()
The violin plots below show an interesting trend: over the years, the density of user ratings has tended to increase in the higher ratings between 4 and 5 stars on both the Apple App Store and the Google Play Store (notice that the teardrop shapes become rounder at the top). This indicates that, over time, users have become more likely to rate an app in the higher 4.5 - 4.9 range than in the lower 4.0 - 4.4 range. The trend holds across both platforms. However, as also seen above, the Apple App Store has wider tails, revealing a more frequent occurrence of lower user ratings.
# extract the four-digit release year, e.g. "2019" (expand=False returns a Series instead of a one-column DataFrame)
pattern = r"([2][0-9]{3})"
android_goodratings["Year"] = android_goodratings["Released"].str.extract(pattern, expand=False)
# extract the three-letter month abbreviation, e.g. "Feb"
monthpattern = r"([A-Z][a-z]{2})"
android_goodratings["Month"] = android_goodratings["Released"].str.extract(monthpattern, expand=False)
# the category lists fix the order in which months and years appear on the plot axes,
# so the frame itself does not need to be sorted
android_goodratings["Month"] = pd.Categorical(android_goodratings["Month"], ['Jan','Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'])
android_goodratings["Year"] = pd.Categorical(android_goodratings["Year"], ['2010', '2011', '2012', '2013', '2014', '2015', '2016', '2017', '2018', '2019', '2020', '2021'])
plt.figure(figsize=(16,8))
sns.set_style("whitegrid")
sns.violinplot(data=android_goodratings, x='Year', y='Rating', palette='pastel')
plt.title("Google Average User Ratings Over Years 2010-2021 for Paid & Free Educational Apps")
plt.xticks(fontsize=12)
plt.show()
# take the release year from the first four characters of the "Released" timestamp
ios_goodratings['Year'] = ios_goodratings["Released"].astype(str).str[:4]
# the category list fixes the order of the years on the plot axis
ios_goodratings["Year"] = pd.Categorical(ios_goodratings["Year"], ['2008', '2009','2010', '2011', '2012', '2013', '2014', '2015', '2016', '2017', '2018', '2019', '2020', '2021'])
plt.figure(figsize=(15,8))
sns.set_style("whitegrid")
sns.violinplot(data=ios_goodratings, x='Year', y='Average_User_Rating', palette='pastel')
plt.title("Apple Average User Ratings Over Years 2008-2021 for Paid & Free Educational Apps")
plt.yticks(ticks=[1.0,2.0,3.0,4.0,5.0], labels=[1,2,3,4,5])
plt.xticks(fontsize=12)
plt.show()
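The shift toward the upper half of the 4-to-5-star band, described for the violin plots, can also be checked with a number per release year. The sketch below assumes a frame with "Year" and "Rating" columns like android_goodratings; the values are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for android_goodratings (column names assumed)
sample = pd.DataFrame({
    "Year": ["2015", "2015", "2015", "2015", "2020", "2020", "2020", "2020"],
    "Rating": [4.0, 4.2, 4.6, 3.0, 4.5, 4.8, 4.9, 4.1],
})

# Keep only the ratings in the 4-to-5-star band
top_band = sample[sample["Rating"] >= 4.0]

# Within that band, the fraction of ratings at 4.5 or higher, per release year
upper_share = (top_band["Rating"] >= 4.5).groupby(top_band["Year"]).mean()
print(upper_share)
```

A share that rises from year to year would agree with the rounder tops of the violins; for the Apple data the rating column would be "Average_User_Rating" instead.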
The categorical charts below show that educational apps priced below 4.99 on both the Apple App Store and the Google Play Store receive the highest ratings but also have the longest tails, which reflects how frequently these prices occur. Further down, I separate out the data by year for better viewing. Apps priced at 6.99 on the Google Play Store have very high ratings without a long tail, while apps priced at 6.99 on the Apple App Store do not have such a high concentration of ratings between 4 and 5.
# set style and font scale in one call; a separate sns.set(font_scale=2) would reset the whitegrid style
sns.set(style="whitegrid", font_scale=2)
# catplot is figure-level and creates its own figure, so no plt.figure() call is needed
sns.catplot(data=android_prices, x='Price', y='Rating', hue='Year', height=10, aspect=2, palette="bright", s=10)
plt.title("Google Educational Apps: Price vs. User Rating")
plt.xticks(rotation=45)
plt.show()
sns.set_style("whitegrid")
sns.catplot(data=ios_prices, x="Price", y="Average_User_Rating",hue='Year', height=10, aspect=2, palette='bright', s=10)
plt.title("Apple Educational Apps: Price vs. User Rating")
plt.xticks(rotation=45)
plt.show()
Apps priced between 3.99 and 4.99 appear to have ratings more concentrated between 3 and 5; however, it remains undetermined whether this is because those prices occur less frequently. Below are the frequency counts per price, for easy comparison.
# total number of Apple apps priced at 99 cents, 1.99, or 2.99
ios_prices["Price"].value_counts().loc[[0.99, 1.99, 2.99]].sum()
9669
## total number of Apple apps priced at 3.99 or 4.99
ios_prices["Price"].value_counts().loc[[3.99, 4.99]].sum()
2909
## the total number of Google apps priced at 99 cents, 1.99, and 2.99
android_prices["Price"].value_counts().loc[[0.99,1.99, 2.99]].sum()
2643
## the total number of Google apps priced at 3.99 and 4.99
android_prices['Price'].value_counts().loc[[3.99, 4.99]].sum()
728
ios_prices["Price"].describe()
count 15167.000000 mean 3.940551 std 3.927261 min 0.990000 25% 1.990000 50% 2.990000 75% 4.990000 max 19.990000 Name: Price, dtype: float64
android_prices["Price"].describe()
count 4148.000000 mean 3.767724 std 3.616105 min 0.990000 25% 0.990000 50% 2.990000 75% 4.990000 max 19.990000 Name: Price, dtype: float64
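To help judge whether a concentration of high ratings simply reflects how rarely a price occurs, the count per price point can be placed next to the share of ratings between 4 and 5. This is a minimal sketch with hypothetical values; the "Price" and "Rating" column names follow android_prices, and the "High" column is introduced only for illustration.

```python
import pandas as pd

# Hypothetical stand-in for android_prices (column names assumed)
sample = pd.DataFrame({
    "Price": [0.99, 0.99, 0.99, 0.99, 3.99, 3.99, 6.99],
    "Rating": [4.5, 2.0, 4.8, 3.1, 4.2, 4.6, 4.9],
})

# Flag the ratings that fall in the 4-to-5-star band
sample["High"] = sample["Rating"].between(4.0, 5.0)

# Per price point: how many apps there are and what share of them is rated 4 to 5
summary = sample.groupby("Price").agg(
    apps=("Rating", "size"),
    high_share=("High", "mean"),
)
print(summary)
```

A price point with a high share but only a handful of apps would be weaker evidence than one with the same share across thousands of apps.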
sns.set_style("whitegrid")
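# catplot returns a FacetGrid; keep a handle to it so the figure can be adjusted (fig was undefined before)
g = sns.catplot(data=ios_prices, x="Price", y="Average_User_Rating", hue='Year', height=10, aspect=2, palette='bright', col='Year', col_wrap=2, s=10)
g.fig.suptitle("Apple Educational Apps: Price vs. User Rating Per Year", x=0.5, y=1.0, fontsize=52)
g.fig.subplots_adjust(top=0.8)
# rotate the tick labels on every facet, not only the last one
for ax in g.axes.flat:
    ax.tick_params(axis="x", labelrotation=45)
plt.show()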
# catplot creates its own figure, so no plt.figure() call is needed
sns.set(style="whitegrid", font_scale=2)
# keep a handle to the FacetGrid so the figure can be adjusted (fig was undefined before)
g = sns.catplot(data=android_prices, x='Price', y='Rating', hue='Year', height=10, aspect=2, palette="bright", col='Year', col_wrap=2, s=10)
g.fig.suptitle("Google Educational Apps: Price vs. User Rating Per Year", x=0.5, y=1.0, fontsize=52)
g.fig.subplots_adjust(top=0.8)
# rotate the tick labels on every facet, not only the last one
for ax in g.axes.flat:
    ax.tick_params(axis="x", labelrotation=45)
plt.show()
Only the Google Play Store dataset gives the number of installs, so we examine them here even though they cannot be compared with Apple App Store installs. On the Google Play Store, the same trend holds for both paid and free apps: the number of installs has declined drastically, pointing to a fiercely competitive mobile market for educational apps. There was a brief peak in 2014, which did not coincide with higher prices on the Google Play Store, although 2014 did see generally higher prices in the separate Apple App Store dataset. It appears that educational app companies on Google kept a low price point to capture market share rather than raising prices as the number of installs increased.
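Because sns.lineplot aggregates repeated x values with the mean by default, a few very popular apps could be enough to produce a peak such as the one in 2014. A minimal sketch for checking this compares the mean with the median per release year; the values are hypothetical and the column names follow android_prices.

```python
import pandas as pd

# Hypothetical stand-in for android_prices (column names assumed)
sample = pd.DataFrame({
    "Year": ["2014", "2014", "2014", "2019", "2019", "2019"],
    "Minimum Installs": [100, 1000, 1000000, 10, 100, 500],
})

# Mean, median, and number of apps per release year
per_year = sample.groupby("Year")["Minimum Installs"].agg(["mean", "median", "count"])
print(per_year)
```

A year in which the mean is far above the median suggests that the peak is driven by outliers rather than by a typical app.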
plt.figure(figsize=(15,8))
sns.set_style("whitegrid")
# palette only takes effect together with hue, so it is omitted here
sns.lineplot(data=android_prices, x='Year', y='Minimum Installs')
plt.xticks(rotation=45)
plt.title("Google Play Store's Ed. App Installs per Release Year (Paid Apps Only)")
plt.show()
plt.figure(figsize=(15,8))
sns.set_style("whitegrid")
# extract the four-digit release year (expand=False returns a Series)
pattern = r"([2][0-9]{3})"
android_final["Year"] = android_final["Released"].str.extract(pattern, expand=False)
# the category list fixes the order of the years on the plot axis
android_final["Year"] = pd.Categorical(android_final["Year"], ['2010', '2011', '2012', '2013', '2014', '2015', '2016', '2017', '2018', '2019', '2020', '2021'])
# palette only takes effect together with hue, so it is omitted here
sns.lineplot(data=android_final, x='Year', y='Minimum Installs')
plt.xticks(rotation=45)
plt.title("Google Play Store's Ed. App Installs per Release Year (Paid and Free Apps)")
plt.show()
The last graph shows that the price point at 3.99 in the Google Play Store got 5 times more installs than any other price point. This is very useful information. Surprisingly, the higher price point of 8.99 also received as many installs as the lower, more frequent price points of 0.99, 1.99, 2.99 and 4.99. Lastly, we can see that the price points 5.99 and 6.99 see a similar number of installs, indicating that customers do not differentiate much between those two prices.
plt.figure(figsize=(15,8))
sns.set_style("whitegrid")
# kind= is a catplot argument, not a violinplot one, so it is dropped here
sns.violinplot(data=android_prices, x='Price', y='Minimum Installs')
plt.title("Educational Google Apps Prices vs. Minimum Install Number")
plt.xticks(rotation=45)
plt.show()
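The "five times more installs" reading of the graph can be expressed as a ratio between one price point and the others. The sketch below uses hypothetical values chosen only to show the calculation; the column names follow android_prices, and the ratio dictionary is an illustrative construct.

```python
import pandas as pd

# Hypothetical stand-in for android_prices (column names assumed)
sample = pd.DataFrame({
    "Price": [0.99, 0.99, 0.99, 3.99, 3.99, 8.99],
    "Minimum Installs": [100, 500, 900, 1000, 4000, 500],
})

# Mean installs at each price point
mean_installs = sample.groupby("Price")["Minimum Installs"].mean()

# For each price point: its mean divided by the average of the other price points' means
ratio = {}
for price in mean_installs.index:
    others = mean_installs.drop(price)
    ratio[price] = mean_installs.loc[price] / others.mean()

print(ratio)
```

Repeating the calculation with the median instead of the mean would show whether a ratio like this depends on a few very popular apps.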
I would release an educational app on the Apple App Store first, with the ambitious goal of a 5.0 average user rating, since ratings there are more likely to be whole numbers and a 5.0 occurs more frequently on the Apple platform. Given the trend that user ratings favor the upper 4.0 range more than in prior years, I would take the risk of releasing on the Apple App Store, despite its overall lower share of ratings in the 4.2-4.7 range compared with the Google Play Store. If I achieve a 5.0 user rating on the Apple App Store, I would later advertise this on the Google Play Store for my app. As a benchmark, an educational app should receive an average user rating between 4.2 and 4.7 to be on target.
I would initially price at 8.99 on the Google platform, to capture the willingness to pay higher prices there. I would then decrease the price to 3.99 to capture the higher number of installs at that price point, since apps priced at 3.99 received five times more installs than apps at other price points. Peak prices after that can be offered once or twice per year at most, to follow the competition, but pricing at no more than 6.99 is always the ideal. Once I lowered the price to 3.99, I would stay at that price most of the year to take advantage of the higher user ratings at the lower price point. 3.99 is optimal because it can be lowered to 2.99 or 1.99 on promotion and still benefit from the expected greater proportion of high user ratings between 4 and 5 stars.
Finally, I would consider 500 installs in the first year an excellent result. Since 2017, the market can count on fewer than 1,000 installs per release year. Combining the app with a desktop platform and bundling the offering is preferable to releasing an app alone, given the very competitive mobile market.
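One way to sanity-check the 500-install benchmark is to ask what share of recently released apps falls below it. This is a minimal sketch with hypothetical values; it assumes integer release years, whereas the "Year" column in this notebook is a string categorical and would need converting first.

```python
import pandas as pd

# Hypothetical stand-in for android_final (integer years assumed for the comparison)
sample = pd.DataFrame({
    "Year": [2016, 2017, 2018, 2019, 2020, 2021],
    "Minimum Installs": [5000, 10, 100, 500, 1000, 50],
})

# Apps released in 2017 or later
recent = sample[sample["Year"] >= 2017]

# Share of those apps with fewer than 500 installs
share_below = float((recent["Minimum Installs"] < 500).mean())
print(share_below)
```

The higher this share is on the real data, the stronger the case for treating 500 installs as an ambitious target.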