Problem 1

Import NumPy under the alias np.

In [1]:

import numpy as np

Problem 2

Import pandas under the alias pd.

In [2]:

import pandas as pd

Problem 3

Given the pandas Series my_series, generate a NumPy array that contains only the unique values from my_series. Assign this new array to a variable called my_array. Print my_array to ensure that the operation has been executed successfully.

In [3]:

my_series = pd.Series([1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9])
my_series

Out[3]:

0     1
1     1
2     2
3     2
4     3
5     3
6     4
7     4
8     5
9     5
10    6
11    6
12    7
13    7
14    8
15    8
16    9
17    9
dtype: int64

In [4]:

#Solution goes here
my_array = my_series.unique()
my_array

Out[4]:

array([1, 2, 3, 4, 5, 6, 7, 8, 9])
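
One detail worth keeping in mind: Series.unique() returns values in order of first appearance, while np.unique() returns them sorted. The two agree for my_series only because it is already sorted. A minimal sketch of the difference, using a hypothetical unsorted Series named shuffled (not part of the original problems):

```python
import numpy as np
import pandas as pd

# Hypothetical unsorted Series, used only for illustration
shuffled = pd.Series([3, 1, 3, 2, 1])

by_appearance = shuffled.unique()               # order of first appearance
sorted_unique = np.unique(shuffled.to_numpy())  # sorted ascending

print(by_appearance)  # [3 1 2]
print(sorted_unique)  # [1 2 3]
```

Either call yields a NumPy array; which one fits depends on whether the original ordering matters.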

Problem 4

Given the pandas DataFrame my_data_frame, generate a NumPy array that contains only the unique values from its first column (the column labeled 0). Assign this new array to a variable called another_array. Print another_array to ensure the operation has been executed successfully.

In [5]:

my_data_frame = pd.DataFrame(np.random.randn(3,5))
my_data_frame

Out[5]:

	0	1	2	3	4
0	0.950120	1.104541	-0.135333	-2.157449	-1.786119
1	-1.772171	0.207613	-1.480314	0.191361	-2.296765
2	-0.576407	-0.615181	1.233100	0.092227	-1.881353

In [6]:

#Solution goes here
another_array = my_data_frame[0].unique()
another_array

Out[6]:

array([ 0.95011976, -1.7721715 , -0.57640705])
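
Because my_data_frame uses the default integer column labels, a column's label and its position coincide, and counting starts at 0. When columns have other labels, iloc selects by position instead. A small sketch, assuming a hypothetical frame with columns named "a" and "b":

```python
import pandas as pd

# Hypothetical frame with named columns, used only for illustration
frame = pd.DataFrame({"a": [1, 1, 2], "b": [5, 6, 5]})

first_by_label = frame["a"].unique()            # select by column label
first_by_position = frame.iloc[:, 0].unique()   # position 0 is the first column
second_by_position = frame.iloc[:, 1].unique()  # position 1 is the second column

print(first_by_label)      # [1 2]
print(second_by_position)  # [5 6]
```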

Problem 5

Count the occurrences of each element in the my_series variable that was created earlier in these practice problems.

In [7]:

my_series.value_counts()

Out[7]:

9    2
8    2
7    2
6    2
5    2
4    2
3    2
2    2
1    2
dtype: int64
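
value_counts() orders its result by count, so ties can appear in an order that looks arbitrary. As a sketch of two common variations, the result can be re-sorted by value with sort_index(), or reported as relative frequencies with normalize=True:

```python
import pandas as pd

my_series = pd.Series([1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9])

# Order the counts by the values themselves rather than by frequency
counts = my_series.value_counts().sort_index()

# Relative frequencies (fractions of the total) instead of raw counts
shares = my_series.value_counts(normalize=True)

print(counts.index.tolist())  # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(counts.loc[1])          # 2
```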

Problem 6

Given the function triple_digit, apply it to every element of my_series.

In [8]:

def triple_digit(x):
    return x + x*10 + x*100

In [9]:

#Solution goes here
my_series.apply(triple_digit)

Out[9]:

0     111
1     111
2     222
3     222
4     333
5     333
6     444
7     444
8     555
9     555
10    666
11    666
12    777
13    777
14    888
15    888
16    999
17    999
dtype: int64
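
Since triple_digit only uses arithmetic, x + x*10 + x*100 simplifies to 111*x, and the same result can be obtained with a vectorized expression on the whole Series. The following sketch compares the two approaches; apply remains the general tool when the function cannot be written as array arithmetic:

```python
import pandas as pd

def triple_digit(x):
    return x + x*10 + x*100

my_series = pd.Series([1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9])

applied = my_series.apply(triple_digit)  # calls the function once per element
vectorized = my_series * 111             # x + 10x + 100x = 111x, computed on the whole Series

print((applied == vectorized).all())  # True
```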

Problem 7

Sort the my_data_frame variable that we created earlier by the contents of its first column (the column labeled 0).

In [10]:

my_data_frame.sort_values(0)

Out[10]:

	0	1	2	3	4
1	-1.772171	0.207613	-1.480314	0.191361	-2.296765
2	-0.576407	-0.615181	1.233100	0.092227	-1.881353
0	0.950120	1.104541	-0.135333	-2.157449	-1.786119
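
Note that sort_values returns a new DataFrame and leaves the original untouched, and that the row labels travel with their rows. A minimal sketch on a hypothetical small frame (not the random data above), which also shows descending order via ascending=False:

```python
import pandas as pd

# Hypothetical small frame, used only for illustration
frame = pd.DataFrame({0: [3, 1, 2], 1: [20, 10, 30]})

ascending = frame.sort_values(0)                    # sort by the column labeled 0
descending = frame.sort_values(1, ascending=False)  # sort by the column labeled 1, largest first

print(ascending.index.tolist())   # [1, 2, 0]
print(descending.index.tolist())  # [2, 0, 1]
print(frame.index.tolist())       # [0, 1, 2] (the original is unchanged)
```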