Import NumPy under the alias np.
import numpy as np
Import pandas under the alias pd.
import pandas as pd
We will again be using salesperson data to test your knowledge of the groupby method. Given the dataset data, print a new DataFrame that shows the mean sales per salesperson, grouped by Organization.
data = pd.DataFrame([['Coca-Cola', 'Nick', 200],
                     ['Coca-Cola', 'Joel', 120],
                     ['Pepsi', 'Taylor', 125],
                     ['Pepsi', 'Josiah', 250],
                     ['Dr. Pepper', 'Josh', 150],
                     ['Dr. Pepper', 'Micaiah', 500]],
                    columns=['Organization', 'Salesperson Name', 'Sales'])
data
| | Organization | Salesperson Name | Sales |
|---|---|---|---|
| 0 | Coca-Cola | Nick | 200 |
| 1 | Coca-Cola | Joel | 120 |
| 2 | Pepsi | Taylor | 125 |
| 3 | Pepsi | Josiah | 250 |
| 4 | Dr. Pepper | Josh | 150 |
| 5 | Dr. Pepper | Micaiah | 500 |
#Solution goes here
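# Select the numeric Sales column first: recent pandas versions raise a
# TypeError when asked to average the text column Salesperson Name.
data.groupby('Organization')[['Sales']].mean()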
| Organization | Sales |
|---|---|
| Coca-Cola | 160.0 |
| Dr. Pepper | 325.0 |
| Pepsi | 187.5 |
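To make the split-apply-combine idea behind groupby concrete, here is a minimal sketch that computes the same means by hand. It rebuilds the exercise DataFrame so it runs on its own; the dictionary name manual_means is a hypothetical helper, not part of the exercise.

```python
import pandas as pd

# Same data as in the exercise above.
data = pd.DataFrame([['Coca-Cola', 'Nick', 200],
                     ['Coca-Cola', 'Joel', 120],
                     ['Pepsi', 'Taylor', 125],
                     ['Pepsi', 'Josiah', 250],
                     ['Dr. Pepper', 'Josh', 150],
                     ['Dr. Pepper', 'Micaiah', 500]],
                    columns=['Organization', 'Salesperson Name', 'Sales'])

# Iterating over a groupby object yields (group key, sub-DataFrame) pairs.
# manual_means is an illustrative name chosen for this sketch.
manual_means = {}
for organization, group in data.groupby('Organization'):
    # "Apply" step: average the Sales values within this one group.
    manual_means[organization] = float(group['Sales'].mean())

print(manual_means)
```

The loop is only for illustration; calling mean on the grouped object directly does the same work in one step.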
Given the dataset data, print a new DataFrame that shows the total sales for each Organization.
# Restrict the total to Sales so the text column Salesperson Name is left out.
data.groupby('Organization')[['Sales']].sum()
| Organization | Sales |
|---|---|
| Coca-Cola | 320 |
| Dr. Pepper | 650 |
| Pepsi | 375 |
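The mean and the total can also be requested together. The sketch below is one possible way to do that with the agg method, assuming the same exercise data; the variable name summary is hypothetical.

```python
import pandas as pd

# Same data as in the exercise above.
data = pd.DataFrame([['Coca-Cola', 'Nick', 200],
                     ['Coca-Cola', 'Joel', 120],
                     ['Pepsi', 'Taylor', 125],
                     ['Pepsi', 'Josiah', 250],
                     ['Dr. Pepper', 'Josh', 150],
                     ['Dr. Pepper', 'Micaiah', 500]],
                    columns=['Organization', 'Salesperson Name', 'Sales'])

# Passing a list of function names to agg returns one column per function.
# summary is an illustrative name chosen for this sketch.
summary = data.groupby('Organization')['Sales'].agg(['mean', 'sum'])

print(summary)
```

Each row of summary is one Organization, with a mean column and a sum column that match the two separate results above.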
Given the dataset data, print a new DataFrame that applies the describe method to each organization.
data.groupby('Organization').describe()
| | Sales | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| Organization | count | mean | std | min | 25% | 50% | 75% | max |
| Coca-Cola | 2.0 | 160.0 | 56.568542 | 120.0 | 140.00 | 160.0 | 180.00 | 200.0 |
| Dr. Pepper | 2.0 | 325.0 | 247.487373 | 150.0 | 237.50 | 325.0 | 412.50 | 500.0 |
| Pepsi | 2.0 | 187.5 | 88.388348 | 125.0 | 156.25 | 187.5 | 218.75 | 250.0 |
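The std column in the describe output is the sample standard deviation (divisor n - 1), not the population version. As a rough check, this sketch recomputes the Coca-Cola value with NumPy; the variable names are hypothetical and the data is rebuilt so the block runs on its own.

```python
import numpy as np
import pandas as pd

# Same data as in the exercise above.
data = pd.DataFrame([['Coca-Cola', 'Nick', 200],
                     ['Coca-Cola', 'Joel', 120],
                     ['Pepsi', 'Taylor', 125],
                     ['Pepsi', 'Josiah', 250],
                     ['Dr. Pepper', 'Josh', 150],
                     ['Dr. Pepper', 'Micaiah', 500]],
                    columns=['Organization', 'Salesperson Name', 'Sales'])

# Pull out the two Coca-Cola sales figures as a NumPy array.
coke_sales = data.loc[data['Organization'] == 'Coca-Cola', 'Sales'].to_numpy()

# ddof=1 divides by n - 1 (sample); ddof=0 divides by n (population).
sample_std = np.std(coke_sales, ddof=1)
population_std = np.std(coke_sales, ddof=0)

# The value pandas reports for the same group.
pandas_std = data.groupby('Organization')['Sales'].std()['Coca-Cola']

print(sample_std, population_std, pandas_std)
```

With only two salespeople per organization the two definitions differ noticeably, which is worth keeping in mind when reading the std column.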