Visualizing a Linear Regression Model in Python

Visualize a Simple Linear Model

Goal: visualize the relationship between the topsoil lead concentration (the lead column, on the y-axis) and the topsoil cadmium concentration (the cadmium column, on the x-axis).

In [2]:
### Previous steps necessary
# import packages
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
# import dataset
data = pd.read_csv("meuse.csv")
# build the model
# scikit-learn expects a 2-D feature array, so reshape the cadmium column to (n_samples, 1);
# .values converts the Series to a NumPy array first, since Series.reshape is deprecated
lr = LinearRegression().fit(data.cadmium.values.reshape((-1, 1)), data.lead)
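To see why the feature column is reshaped before fitting, here is a minimal sketch. It assumes a small made-up Series in place of the real cadmium column (the numbers are illustrative, not taken from meuse.csv) and only compares array shapes.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the cadmium column; the values are made up.
cadmium = pd.Series([1.2, 8.6, 6.5, 2.6], name="cadmium")

# The underlying NumPy array of a Series is one-dimensional: shape (4,).
flat = cadmium.values

# reshape((-1, 1)) keeps the values but arranges them as a single column;
# -1 tells NumPy to infer the number of rows.
column = cadmium.values.reshape((-1, 1))

# Series.to_frame() produces the same 2-D layout as a one-column DataFrame.
frame = cadmium.to_frame()

print(flat.shape, column.shape, frame.shape)
```

Either the reshaped array or the one-column DataFrame has the (n_samples, n_features) layout that scikit-learn estimators expect for the feature argument.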

We use the matplotlib package to visualize:

In [4]:
# reference: https://becominghuman.ai/implementing-and-visualizing-linear-regression-in-python-with-scikit-learn-a073768dc688
import matplotlib.pyplot as plt
# scikit-learn expects a 2-D feature array of shape (n_samples, 1)
x = data.cadmium.values.reshape((-1, 1))
# observed values as red points
plt.scatter(x, data.lead, color="red")
# fitted regression line in green
plt.plot(x, lr.predict(x), color="green")
plt.title("Lead vs Cadmium")
plt.xlabel("Cadmium")
plt.ylabel("Lead")
plt.show()
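The same kind of plot can be tried without the dataset. The following is a rough sketch under stated assumptions: the data are synthetic (random numbers generated here, not the meuse measurements), the variable names are illustrative, and the non-interactive Agg backend is selected only so the code runs without a display. Because a fitted simple linear model is a straight line, predicting at the smallest and largest x is enough to draw it.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line inside a notebook
import matplotlib.pyplot as plt
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the cadmium/lead columns (made-up numbers).
rng = np.random.default_rng(0)
x = rng.uniform(0, 18, size=50)
y = 20 * x + 70 + rng.normal(0, 30, size=50)

# Fit on a 2-D feature array of shape (n_samples, 1).
model = LinearRegression().fit(x.reshape((-1, 1)), y)

# Two points determine the line: predict at the minimum and maximum x.
x_ends = np.array([x.min(), x.max()]).reshape((-1, 1))
y_ends = model.predict(x_ends)

fig, ax = plt.subplots()
ax.scatter(x, y, color="red", label="observations")
ax.plot(x_ends.ravel(), y_ends, color="green", label="fitted line")
ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()
```

In a notebook, the figure can then be displayed with plt.show() or written to disk with fig.savefig(...).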

See the next post for how to analyze the model.