Problem 1

Import NumPy under the alias np.

In [1]:
import numpy as np

Problem 2

Import pandas under the alias pd.

In [2]:
import pandas as pd

Problem 3

Given the pandas Series my_series, generate a NumPy array that contains only the unique values from my_series. Assign this new array to a variable called my_array. Print my_array to ensure that the operation has been executed successfully.

In [3]:
my_series = pd.Series([1,1,2,2,3,3,4,4,5,5,6,6,7,7,8,8,9,9])
my_series
Out[3]:
0     1
1     1
2     2
3     2
4     3
5     3
6     4
7     4
8     5
9     5
10    6
11    6
12    7
13    7
14    8
15    8
16    9
17    9
dtype: int64
In [4]:
# Solution: Series.unique() returns the distinct values as a NumPy array
my_array = my_series.unique()
my_array
Out[4]:
array([1, 2, 3, 4, 5, 6, 7, 8, 9])
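
One detail worth knowing here: Series.unique() keeps values in order of first appearance, while np.unique() returns them sorted. The two coincide for my_series only because it is already sorted. A minimal sketch, using a hypothetical unsorted Series (demo_series is not part of the original problems):

```python
import numpy as np
import pandas as pd

# Hypothetical unsorted data, chosen so the two orderings differ
demo_series = pd.Series([3, 1, 3, 2, 1])

# pandas keeps the order in which each value first appears
appearance_order = demo_series.unique()

# NumPy returns the distinct values in sorted order
sorted_order = np.unique(demo_series.to_numpy())

print(appearance_order)  # [3 1 2]
print(sorted_order)      # [1 2 3]
```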

Problem 4

Given the pandas DataFrame my_data_frame, generate a NumPy array that contains only the unique values from its first column (the column labeled 0). Assign this new array to a variable called another_array. Print another_array to ensure the operation has been executed successfully.

In [5]:
my_data_frame = pd.DataFrame(np.random.randn(3,5))
my_data_frame
Out[5]:
          0         1         2         3         4
0  0.950120  1.104541 -0.135333 -2.157449 -1.786119
1 -1.772171  0.207613 -1.480314  0.191361 -2.296765
2 -0.576407 -0.615181  1.233100  0.092227 -1.881353
In [6]:
# Solution: select the column labeled 0, then keep its distinct values
another_array = my_data_frame[0].unique()
another_array
Out[6]:
array([ 0.95011976, -1.7721715 , -0.57640705])
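
Because the columns carry default integer labels, my_data_frame[0] selects the first column and the label 1 would select the second. When the intent is "the n-th column regardless of its label", position-based selection with .iloc is the safer choice. The sketch below uses a hypothetical seeded frame (demo_frame, with an arbitrary seed) so the result is reproducible; it is an illustration, not the data shown above.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with the same shape as my_data_frame; the seed is arbitrary
rng = np.random.default_rng(0)
demo_frame = pd.DataFrame(rng.standard_normal((3, 5)))

# Label-based selection: the column whose label is 0
first_by_label = demo_frame[0].unique()

# Position-based selection: first and second columns, whatever their labels
first_by_position = demo_frame.iloc[:, 0].unique()
second_by_position = demo_frame.iloc[:, 1].unique()
```

With default labels the first two results are identical; they would differ only if the columns were renamed or reordered.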

Problem 5

Count the occurrences of every element within the my_series variable that was created earlier in these practice problems.

In [7]:
my_series.value_counts()
Out[7]:
9    2
8    2
7    2
6    2
5    2
4    2
3    2
2    2
1    2
dtype: int64
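
value_counts() orders its result by count, so when every count is tied (as here) the order of the rows may differ between pandas versions. If a stable order matters, sorting by the index afterwards is one option, and normalize=True returns proportions instead of raw counts. A minimal sketch, assuming a stand-in Series named demo_series with the same values as my_series:

```python
import pandas as pd

# Stand-in for my_series, with the same values
demo_series = pd.Series([1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9])

# Counts ordered by value rather than by frequency
counts = demo_series.value_counts().sort_index()

# Relative frequencies; they sum to 1
shares = demo_series.value_counts(normalize=True).sort_index()
```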

Problem 6

Given the function triple_digit, apply it to every element of my_series.

In [8]:
def triple_digit(x):
    return x + x*10 + x*100
In [9]:
# Solution: apply() calls triple_digit once per element
my_series.apply(triple_digit)
Out[9]:
0     111
1     111
2     222
3     222
4     333
5     333
6     444
7     444
8     555
9     555
10    666
11    666
12    777
13    777
14    888
15    888
16    999
17    999
dtype: int64
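
Since triple_digit uses only arithmetic operators, it can also be called on the whole Series at once; pandas then performs the arithmetic element-wise without a Python-level loop. The sketch below compares both approaches on a small hypothetical Series (demo_series) and is meant as an illustration rather than a replacement for the apply() exercise.

```python
import pandas as pd

def triple_digit(x):
    return x + x*10 + x*100

# Small hypothetical input
demo_series = pd.Series([1, 2, 3])

# Element-by-element: the function is called once per value
applied = demo_series.apply(triple_digit)

# Vectorized: the function is called once on the whole Series
vectorized = triple_digit(demo_series)

print(applied.tolist())     # [111, 222, 333]
print(vectorized.tolist())  # [111, 222, 333]
```

For functions that cannot be expressed with vectorized operations, apply() remains the general tool.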

Problem 7

Sort the my_data_frame variable that we created earlier by the contents of its first column (the column labeled 0).

In [10]:
my_data_frame.sort_values(0)
Out[10]:
          0         1         2         3         4
1 -1.772171  0.207613 -1.480314  0.191361 -2.296765
2 -0.576407 -0.615181  1.233100  0.092227 -1.881353
0  0.950120  1.104541 -0.135333 -2.157449 -1.786119
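
Note that sort_values() returns a new, sorted DataFrame and leaves the original untouched; the row labels travel with their rows, which is why the index above reads 1, 2, 0. A minimal sketch on a small hand-written frame (demo_frame and its values are hypothetical), also showing a descending sort on a different column:

```python
import pandas as pd

# Hypothetical two-column frame with default integer column labels
demo_frame = pd.DataFrame({0: [0.5, -1.0, 0.1], 1: [2.0, 0.3, -0.7]})

# Ascending sort on the column labeled 0
by_first = demo_frame.sort_values(0)

# Descending sort on the column labeled 1
by_second_desc = demo_frame.sort_values(1, ascending=False)

print(by_first.index.tolist())        # [1, 2, 0]
print(by_second_desc.index.tolist())  # [0, 1, 2]
print(demo_frame.index.tolist())      # [0, 1, 2], original order preserved
```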