from google.colab import drive
drive.mount('/data')  # mount Google Drive so that '/data/My Drive/...' exists
data_dir = '/data/My Drive/Colab Notebooks/Experiment'
!ls '/data/My Drive/Colab Notebooks/Experiment'
!pip install matplotlib
import pandas as pd
man = pd.read_csv(data_dir+'/m_data.csv')
woman = pd.read_csv(data_dir+'/w_data.csv')
data = pd.concat([man, woman])
import matplotlib.pyplot as plt
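# One scatter call per group: the 3rd and 4th positional arguments of
# plt.scatter are marker size and color, not a second x/y pair.
plt.scatter(man['bmi'], man['steps'], label='man')
plt.scatter(woman['bmi'], woman['steps'], label='woman')
plt.legend()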
z = data.sample(n=500)
plt.scatter(z['bmi'], z['steps'], marker=None)
 | Unnamed: 0 | bmi | steps |
0 | 22393 | 14.0 | 217.0 |
1 | 16685 | 171.0 | 176.0 |
2 | 15155 | 91.0 | 168.0 |
3 | 6162 | 86.0 | 101.0 |
4 | 22150 | 146.0 | 215.0 |
man.describe()
 | Unnamed: 0 | bmi | steps |
count | 18548.000000 | 18548.000000 | 18548.000000 |
mean | 23180.819388 | 254.411096 | 219.864568 |
std | 13357.077654 | 174.845136 | 102.026507 |
min | 2.000000 | 0.000000 | 0.000000 |
25% | 11642.750000 | 97.000000 | 145.000000 |
50% | 23307.000000 | 202.000000 | 222.000000 |
75% | 34706.250000 | 441.250000 | 292.000000 |
max | 46368.000000 | 549.000000 | 411.000000 |
woman.describe()
 | Unnamed: 0 | bmi | steps |
count | 18548.000000 | 18548.000000 | 18548.000000 |
mean | 23282.573647 | 254.232909 | 220.864406 |
std | 13439.323095 | 174.901153 | 102.534039 |
min | 1.000000 | 0.000000 | 0.000000 |
25% | 11537.250000 | 97.000000 | 144.000000 |
50% | 23214.000000 | 200.000000 | 222.000000 |
75% | 35035.750000 | 442.000000 | 296.000000 |
max | 46370.000000 | 549.000000 | 411.000000 |
As the describe() output for the two data frames above shows, the average step counts of men and women are very similar: the men's mean is 219.86 steps and the women's mean is 220.86 steps.
The standard deviations are also very close (102.03 for men and 102.53 for women). From these two tables, we can tell that men and women have a similar walking pattern, at least in terms of the mean and spread of their step counts.
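The comparison above can also be done in one small table instead of reading two separate describe() outputs. The following is only a minimal sketch: the helper name `compare_steps` and the tiny `demo_man` / `demo_woman` frames are made up for illustration and are not part of the original notebook or its data. It assumes both frames have a `steps` column, as `man` and `woman` do.

```python
import pandas as pd

def compare_steps(man: pd.DataFrame, woman: pd.DataFrame) -> pd.DataFrame:
    """Return the mean and std of 'steps' for both groups, side by side."""
    summary = pd.DataFrame({
        'man': man['steps'].agg(['mean', 'std']),
        'woman': woman['steps'].agg(['mean', 'std']),
    })
    # Absolute difference between the two groups for each statistic
    summary['abs_diff'] = (summary['man'] - summary['woman']).abs()
    return summary

# Hypothetical toy data, only to show the call (NOT the real m_data / w_data)
demo_man = pd.DataFrame({'bmi': [20.0, 25.0, 30.0],
                         'steps': [100.0, 200.0, 300.0]})
demo_woman = pd.DataFrame({'bmi': [21.0, 24.0, 29.0],
                           'steps': [110.0, 210.0, 310.0]})

print(compare_steps(demo_man, demo_woman))
```

With the real data, calling `compare_steps(man, woman)` should give the same means and standard deviations as the two tables above, with the differences in a third column.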
man[['bmi', 'steps']].corr()
 | bmi | steps |
bmi | 1.000000 | -0.099439 |
steps | -0.099439 | 1.000000 |
woman[['bmi', 'steps']].corr()
 | bmi | steps |
bmi | 1.0000 | -0.0956 |
steps | -0.0956 | 1.0000 |
man[['bmi', 'steps']].hist()
woman[['bmi', 'steps']].hist()
