In supervised learning, regression refers to modelling and predicting a continuous target variable, and one of the most basic regression techniques is linear regression.
Linear regression is usually among the first topics covered when learning predictive modeling. The dependent variable (or target) is continuous, while the independent variables (or predictors) can be continuous or discrete. The fitted relationship is linear, so it can be drawn as a straight line on a graph.
Linear regression establishes a relationship between a dependent variable (Y) and one or more independent variables (X) using a best-fit straight line.
With a single predictor, the model is represented by the equation Y = a*X + b, where a is the slope of the line and b is the intercept. Once a and b are estimated, this equation can be used to predict the value of the target variable from the given predictor variable(s).
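To make the formula concrete, here is a minimal sketch that applies Y = a*X + b with made-up values for the slope and intercept (the numbers are purely illustrative and not estimated from any dataset):
import numpy as np
a, b = 0.05, 7.0                       # hypothetical slope and intercept
X = np.array([10.0, 100.0, 250.0])     # example predictor values
Y_pred = a * X + b                     # predicted target values
print(Y_pred)                          # [  7.5  12.   19.5]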
In Python, a couple of libraries can be used to perform linear regression. One of them is statsmodels, which provides modules for fitting linear regression by Ordinary Least Squares (OLS). Alternatively, we can use scikit-learn (sklearn), which provides Python implementations of a wide range of machine learning algorithms.
In this demonstration notebook we will see examples of both a statsmodels-based model and a sklearn-based model.
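As a quick preview of the two approaches, here is a minimal sketch that fits the same simple regression with both libraries on a small made-up dataset; both should recover essentially the same intercept and slope:
# Sketch: the same fit with statsmodels (formula API) and with sklearn.
# The toy data below is made up purely for illustration.
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.linear_model import LinearRegression
toy = pd.DataFrame({'x': [1, 2, 3, 4, 5],
                    'y': [2.1, 3.9, 6.2, 8.1, 9.8]})
# statsmodels: Ordinary Least Squares via an R-style formula
ols_fit = smf.ols('y ~ x', data=toy).fit()
print(ols_fit.params)                 # intercept and slope
# sklearn: the same model via the estimator API
lr = LinearRegression().fit(toy[['x']], toy['y'])
print(lr.intercept_, lr.coef_)        # should match the statsmodels estimates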
Import the required libraries and set up the notebook. Here we will be using:
Pandas and NumPy for data manipulation and arrays
Matplotlib (pyplot), Seaborn, Plotly and bqplot for visualizations
scikit-learn (sklearn) and statsmodels for machine learning algorithms and statistics
and some additional helpers.
# %load ../standard_import.txt
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import axes3d
import seaborn as sns
from sklearn.preprocessing import scale
import sklearn.linear_model as skl_lm
from sklearn.metrics import mean_squared_error, r2_score
import statsmodels.api as sm
import statsmodels.formula.api as smf
%matplotlib inline
plt.style.use('seaborn-white')
The example datasets in this tutorial come from the book An Introduction to Statistical Learning with Applications in R.
We will be using the Auto.csv and Advertising.csv datasets.
advertising = pd.read_csv('https://raw.githubusercontent.com/colaberry/DSin100days/master/data/Advertising.csv', usecols=[1,2,3,4])
advertising.info()
advertising.head()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 200 entries, 0 to 199
Data columns (total 4 columns):
TV           200 non-null float64
Radio        200 non-null float64
Newspaper    200 non-null float64
Sales        200 non-null float64
dtypes: float64(4)
memory usage: 6.3 KB
 | TV | Radio | Newspaper | Sales
---|---|---|---|---
0 | 230.1 | 37.8 | 69.2 | 22.1 |
1 | 44.5 | 39.3 | 45.1 | 10.4 |
2 | 17.2 | 45.9 | 69.3 | 9.3 |
3 | 151.5 | 41.3 | 58.5 | 18.5 |
4 | 180.8 | 10.8 | 58.4 | 12.9 |
auto = pd.read_csv('https://raw.githubusercontent.com/colaberry/DSin100days/master/data/Auto.csv', na_values='?').dropna()
auto.info()
auto.head()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 392 entries, 0 to 396
Data columns (total 9 columns):
mpg             392 non-null float64
cylinders       392 non-null int64
displacement    392 non-null float64
horsepower      392 non-null float64
weight          392 non-null int64
acceleration    392 non-null float64
year            392 non-null int64
origin          392 non-null int64
name            392 non-null object
dtypes: float64(4), int64(4), object(1)
memory usage: 30.6+ KB
 | mpg | cylinders | displacement | horsepower | weight | acceleration | year | origin | name
---|---|---|---|---|---|---|---|---|---
0 | 18.0 | 8 | 307.0 | 130.0 | 3504 | 12.0 | 70 | 1 | chevrolet chevelle malibu |
1 | 15.0 | 8 | 350.0 | 165.0 | 3693 | 11.5 | 70 | 1 | buick skylark 320 |
2 | 18.0 | 8 | 318.0 | 150.0 | 3436 | 11.0 | 70 | 1 | plymouth satellite |
3 | 16.0 | 8 | 304.0 | 150.0 | 3433 | 12.0 | 70 | 1 | amc rebel sst |
4 | 17.0 | 8 | 302.0 | 140.0 | 3449 | 10.5 | 70 | 1 | ford torino |
sns.regplot(advertising.TV, advertising.Sales, order=1, ci=None, scatter_kws={'color':'r', 's':9})
plt.xlim(-10,310)
plt.ylim(ymin=0);
import plotly.plotly as py
import plotly.graph_objs as go
# Create a trace
trace = go.Scatter(
x = advertising.TV,
y = advertising.Sales,
mode = 'markers'
)
data = [trace]
# Plot and embed in ipython notebook!
py.iplot(data, filename='basic-scatter')
Let us create an interactive version of the scatterplot to see how TV advertising affects Sales. We can visually select some points (for example, apparent outliers) and then print them.
from __future__ import print_function
from bqplot import *
import numpy as np
import pandas as pd
from ipywidgets import Layout
x_sc = LinearScale()
y_sc = LinearScale()
x_data = np.arange(20)
y_data = np.random.randn(20)
scatter_chart = Scatter(x=advertising.TV, y=advertising.Sales, scales= {'x': x_sc, 'y': y_sc}, colors=['dodgerblue'],
interactions={'click': 'select'},
selected_style={'opacity': 1.0, 'fill': 'DarkOrange', 'stroke': 'Red'},
unselected_style={'opacity': 0.5})
ax_x = Axis(scale=x_sc)
ax_y = Axis(scale=y_sc, orientation='vertical', tick_format='0.2f')
Figure(marks=[scatter_chart], axes=[ax_x, ax_y])
# To retrieve the visually selected points as an array, use the .selected attribute.
scatter_chart.selected
[]
Note that the ISLR text reports the coefficients for the uncentered data, whereas the plot is based on centered data, which is visually more appealing for explaining the idea of a minimized RSS. In the book, the values on the β0 axis appear to have been adjusted to correspond with the text, presumably to avoid confusing the reader; the axes on the plots below are left unaltered.
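To make the connection explicit: centering TV changes only the intercept, which becomes the mean of Sales, while the slope is unchanged, and the uncentered intercept can be recovered as the centered intercept minus the slope times mean(TV). Here is a small sketch using the advertising data loaded above:
# Sketch: relationship between the centered and uncentered intercepts.
import numpy as np
from sklearn.linear_model import LinearRegression
X_raw = advertising.TV.values.reshape(-1, 1)
X_centered = X_raw - X_raw.mean()
y = advertising.Sales
fit_raw = LinearRegression().fit(X_raw, y)
fit_centered = LinearRegression().fit(X_centered, y)
print(fit_raw.coef_, fit_centered.coef_)   # identical slopes
print(fit_centered.intercept_, y.mean())   # centered intercept equals mean(Sales)
print(fit_raw.intercept_,
      fit_centered.intercept_ - fit_centered.coef_[0] * X_raw.mean())  # same value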
# Regression coefficients (Ordinary Least Squares)
regr = skl_lm.LinearRegression()
X = scale(advertising.TV, with_mean=True, with_std=False).reshape(-1,1)
y = advertising.Sales
regr.fit(X,y)
print(regr.intercept_)
print(regr.coef_)
14.0225
[ 0.04753664]
# Create grid coordinates for plotting
B0 = np.linspace(regr.intercept_-2, regr.intercept_+2, 50)
B1 = np.linspace(regr.coef_[0]-0.02, regr.coef_[0]+0.02, 50)  # use the scalar coefficient
xx, yy = np.meshgrid(B0, B1, indexing='xy')
Z = np.zeros((B0.size,B1.size))
# Calculate Z-values (RSS) based on grid of coefficients
for (i,j),v in np.ndenumerate(Z):
Z[i,j] =((y - (xx[i,j]+X.ravel()*yy[i,j]))**2).sum()/1000
# Minimized RSS
min_RSS = r'$\beta_0$, $\beta_1$ for minimized RSS'
min_rss = np.sum((regr.intercept_+regr.coef_*X - y.values.reshape(-1,1))**2)/1000
min_rss
2.1025305831313514
fig = plt.figure(figsize=(15,6))
fig.suptitle('RSS - Regression coefficients', fontsize=20)
ax1 = fig.add_subplot(121)
ax2 = fig.add_subplot(122, projection='3d')
# Left plot
CS = ax1.contour(xx, yy, Z, cmap=plt.cm.Set1, levels=[2.15, 2.2, 2.3, 2.5, 3])
ax1.scatter(regr.intercept_, regr.coef_[0], c='r', label=min_RSS)
ax1.clabel(CS, inline=True, fontsize=10, fmt='%1.1f')
# Right plot
ax2.plot_surface(xx, yy, Z, rstride=3, cstride=3, alpha=0.3)
ax2.contour(xx, yy, Z, zdir='z', offset=Z.min(), cmap=plt.cm.Set1,
alpha=0.4, levels=[2.15, 2.2, 2.3, 2.5, 3])
ax2.scatter3D(regr.intercept_, regr.coef_[0], min_rss, c='r', label=min_RSS)
ax2.set_zlabel('RSS')
ax2.set_zlim(Z.min(),Z.max())
ax2.set_ylim(0.02,0.07)
# settings common to both plots
for ax in fig.axes:
ax.set_xlabel(r'$\beta_0$', fontsize=17)
ax.set_ylabel(r'$\beta_1$', fontsize=17)
ax.set_yticks([0.03,0.04,0.05,0.06])
ax.legend()
est = smf.ols('Sales ~ TV', advertising).fit()
est.summary().tables[1]
 | coef | std err | t | P>|t| | [0.025 | 0.975]
---|---|---|---|---|---|---
Intercept | 7.0326 | 0.458 | 15.360 | 0.000 | 6.130 | 7.935 |
TV | 0.0475 | 0.003 | 17.668 | 0.000 | 0.042 | 0.053 |
# RSS with regression coefficients
((advertising.Sales - (est.params[0] + est.params[1]*advertising.TV))**2).sum()/1000
2.1025305831313506
regr = skl_lm.LinearRegression()
X = advertising.TV.values.reshape(-1,1)
y = advertising.Sales
regr.fit(X,y)
print(regr.intercept_)
print(regr.coef_)
7.03259354913
[ 0.04753664]
Sales_pred = regr.predict(X)
r2_score(y, Sales_pred)
0.61187505085007099
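For reference, R-squared is simply 1 - RSS/TSS, so the value above can be reproduced by hand from the residuals. A quick sketch reusing y and Sales_pred from the cell above:
# Sketch: computing R-squared manually as 1 - RSS/TSS.
import numpy as np
rss = np.sum((y - Sales_pred) ** 2)    # residual sum of squares
tss = np.sum((y - y.mean()) ** 2)      # total sum of squares
print(1 - rss / tss)                   # matches r2_score(y, Sales_pred)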
Next, we will plot the regression line for TV vs. Sales, then drag some points around to see how this impacts the regression line. This gives us an idea of whether we need to remove some outliers to obtain a model that is more consistent with the general direction of the data. (A non-interactive version of this check is sketched after the interactive figure below.)
import bqplot.marks as bqm
import bqplot.scales as bqs
import bqplot.axes as bqa
import bqplot as bq
from IPython.display import display
import ipywidgets as widgets
print('ipywidgets version', widgets.__version__)
print('bqplot version', bq.__version__)
def update_line(change):
# create line fit to data and display equation
lin.x = [np.min(scat.x), np.max(scat.x)]
poly = np.polyfit(scat.x, scat.y, 1)
lin.y = np.polyval(poly, lin.x)
label.value = 'y = {:.2f} + {:.2f}x'.format(poly[1], poly[0])
# create initial data set
size = 10
np.random.seed(0)
x_data = advertising.TV
y_data = advertising.Sales
# set up plot elements
sc_x = bqs.LinearScale()
sc_y = bqs.LinearScale()
ax_x = bqa.Axis(scale=sc_x)
ax_y = bqa.Axis(scale=sc_y, tick_format='0.2f', orientation='vertical')
# place data on scatter plot that allows point dragging
scat = bqm.Scatter(x=x_data,
y=y_data,
scales={'x': sc_x, 'y': sc_y},
enable_move=True)
# set up callback
scat.observe(update_line, names=['x', 'y'])
# linear fit line
lin = bqm.Lines(scales={'x': sc_x, 'y': sc_y})
# equation label
label = widgets.Label()
# containers
fig = bq.Figure(marks=[scat, lin], axes=[ax_x, ax_y])
box = widgets.VBox([label, fig])
# initialize plot and equation and display
update_line(None)
display(box)
ipywidgets version 7.2.1
bqplot version 0.10.5
[Interactive figure output; initial fit label: y = 7.03 + 0.05x]
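A non-interactive sketch of the same idea is to refit after dropping the points with the largest residuals and compare the coefficients; the 5% cutoff below is an arbitrary illustrative choice, not a recommendation:
# Sketch: refit Sales ~ TV after dropping the largest-residual points.
import numpy as np
import statsmodels.formula.api as smf
full_fit = smf.ols('Sales ~ TV', advertising).fit()
resid = np.abs(full_fit.resid)
keep = resid < np.percentile(resid, 95)    # drop the top 5% of residuals
trimmed_fit = smf.ols('Sales ~ TV', advertising[keep]).fit()
print(full_fit.params)
print(trimmed_fit.params)                  # compare how the slope/intercept shift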
Let us start by including two separate variables (Radio and Newspaper) in two different models.
est = smf.ols('Sales ~ Radio', advertising).fit()
est.summary().tables[1]
 | coef | std err | t | P>|t| | [0.025 | 0.975]
---|---|---|---|---|---|---
Intercept | 9.3116 | 0.563 | 16.542 | 0.000 | 8.202 | 10.422 |
Radio | 0.2025 | 0.020 | 9.921 | 0.000 | 0.162 | 0.243 |
est = smf.ols('Sales ~ Newspaper', advertising).fit()
est.summary().tables[1]
 | coef | std err | t | P>|t| | [0.025 | 0.975]
---|---|---|---|---|---|---
Intercept | 12.3514 | 0.621 | 19.876 | 0.000 | 11.126 | 13.577 |
Newspaper | 0.0547 | 0.017 | 3.300 | 0.001 | 0.022 | 0.087 |
Now let us include all three variables (TV, Radio and Newspaper) together in a single model.
est = smf.ols('Sales ~ TV + Radio + Newspaper', advertising).fit()
est.summary()
Dep. Variable: | Sales | R-squared: | 0.897 |
---|---|---|---|
Model: | OLS | Adj. R-squared: | 0.896 |
Method: | Least Squares | F-statistic: | 570.3 |
Date: | Wed, 20 Jun 2018 | Prob (F-statistic): | 1.58e-96 |
Time: | 19:03:45 | Log-Likelihood: | -386.18 |
No. Observations: | 200 | AIC: | 780.4 |
Df Residuals: | 196 | BIC: | 793.6 |
Df Model: | 3 | | |
Covariance Type: | nonrobust | | |
 | coef | std err | t | P>|t| | [0.025 | 0.975]
---|---|---|---|---|---|---
Intercept | 2.9389 | 0.312 | 9.422 | 0.000 | 2.324 | 3.554 |
TV | 0.0458 | 0.001 | 32.809 | 0.000 | 0.043 | 0.049 |
Radio | 0.1885 | 0.009 | 21.893 | 0.000 | 0.172 | 0.206 |
Newspaper | -0.0010 | 0.006 | -0.177 | 0.860 | -0.013 | 0.011 |
Omnibus: | 60.414 | Durbin-Watson: | 2.084 |
---|---|---|---|
Prob(Omnibus): | 0.000 | Jarque-Bera (JB): | 151.241 |
Skew: | -1.327 | Prob(JB): | 1.44e-33 |
Kurtosis: | 6.332 | Cond. No. | 454. |
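The statistics in this summary can also be pulled off the fitted results object programmatically, which is often more convenient than reading the table. A short sketch using the est object fitted above:
# Sketch: accessing fit statistics directly from the statsmodels results object.
print(est.rsquared, est.rsquared_adj)   # R-squared and adjusted R-squared
print(est.params['Newspaper'])          # a single coefficient
print(est.conf_int())                   # 95% confidence intervals
print(est.pvalues)                      # p-values for each term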
advertising.corr()
 | TV | Radio | Newspaper | Sales
---|---|---|---|---
TV | 1.000000 | 0.054809 | 0.056648 | 0.782224 |
Radio | 0.054809 | 1.000000 | 0.354104 | 0.576223 |
Newspaper | 0.056648 | 0.354104 | 1.000000 | 0.228299 |
Sales | 0.782224 | 0.576223 | 0.228299 | 1.000000 |
Let us do a multiple regression using sklearn.
regr = skl_lm.LinearRegression()
X = advertising[['Radio', 'TV']].values  # .as_matrix() was removed in newer pandas
y = advertising.Sales
regr.fit(X,y)
print(regr.coef_)
print(regr.intercept_)
regr_model = regr
def predict(tv, radio):
    # Keep the feature order consistent with the training matrix: [Radio, TV]
    data = pd.DataFrame({'Radio': [radio], 'TV': [tv]})
return regr_model.predict(data)
#prediction = predict(advertising[['TV','Radio']])
prediction = predict(180.8,10.8)
print(prediction)
[ 0.18799423  0.04575482]
2.92109991241
[ 13.22390813]
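As a sanity check, the prediction above can be reproduced by hand from the fitted coefficients, remembering that the coefficient order matches the training matrix (Radio first, then TV):
# Sketch: reproduce the prediction manually from the fitted coefficients.
radio, tv = 10.8, 180.8
manual = regr.intercept_ + regr.coef_[0] * radio + regr.coef_[1] * tv
print(manual)   # roughly 2.921 + 0.188*10.8 + 0.0458*180.8, i.e. about 13.22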
Let us create a 3-D visualization of the fitted regression plane (hyperplane) using matplotlib. This involves a bit of Python coding, as shown below.
# What are the min/max values of Radio & TV?
# Use these values to set up the grid for plotting.
advertising[['Radio', 'TV']].describe()
 | Radio | TV
---|---|---
count | 200.000000 | 200.000000 |
mean | 23.264000 | 147.042500 |
std | 14.846809 | 85.854236 |
min | 0.000000 | 0.700000 |
25% | 9.975000 | 74.375000 |
50% | 22.900000 | 149.750000 |
75% | 36.525000 | 218.825000 |
max | 49.600000 | 296.400000 |
# Create a coordinate grid
Radio = np.arange(0,50)
TV = np.arange(0,300)
B1, B2 = np.meshgrid(Radio, TV, indexing='xy')
Z = np.zeros((TV.size, Radio.size))
# Evaluate the fitted plane (intercept + coefficients) over the grid
for (i,j),v in np.ndenumerate(Z):
Z[i,j] =(regr.intercept_ + B1[i,j]*regr.coef_[0] + B2[i,j]*regr.coef_[1])
# Create plot
fig = plt.figure(figsize=(10,6))
fig.suptitle('Regression: Sales ~ Radio + TV Advertising', fontsize=20)
ax = axes3d.Axes3D(fig)
ax.plot_surface(B1, B2, Z, rstride=10, cstride=5, alpha=0.4)
ax.scatter3D(advertising.Radio, advertising.TV, advertising.Sales, c='r')
ax.set_xlabel('Radio')
ax.set_xlim(0,50)
ax.set_ylabel('TV')
ax.set_ylim(ymin=0)
ax.set_zlabel('Sales');
est = smf.ols('Sales ~ TV + Radio + TV*Radio', advertising).fit()
est.summary().tables[1]
 | coef | std err | t | P>|t| | [0.025 | 0.975]
---|---|---|---|---|---|---
Intercept | 6.7502 | 0.248 | 27.233 | 0.000 | 6.261 | 7.239 |
TV | 0.0191 | 0.002 | 12.699 | 0.000 | 0.016 | 0.022 |
Radio | 0.0289 | 0.009 | 3.241 | 0.001 | 0.011 | 0.046 |
TV:Radio | 0.0011 | 5.24e-05 | 20.727 | 0.000 | 0.001 | 0.001 |
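With the interaction term included, the effect of TV on Sales depends on the Radio budget: the slope of Sales with respect to TV is approximately 0.0191 + 0.0011 * Radio. A short sketch evaluating this marginal effect at a few illustrative Radio levels:
# Sketch: marginal effect of TV on Sales in the interaction model.
b_tv = est.params['TV']
b_inter = est.params['TV:Radio']
for radio in [0, 25, 50]:
    print(radio, b_tv + b_inter * radio)   # slope of Sales with respect to TV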
# With Seaborn's regplot() you can easily plot higher order polynomials.
plt.scatter(auto.horsepower, auto.mpg, facecolors='None', edgecolors='k', alpha=.5)
sns.regplot(auto.horsepower, auto.mpg, ci=None, label='Linear', scatter=False, color='orange')
sns.regplot(auto.horsepower, auto.mpg, ci=None, label='Degree 2', order=2, scatter=False, color='lightblue')
sns.regplot(auto.horsepower, auto.mpg, ci=None, label='Degree 5', order=5, scatter=False, color='g')
plt.legend()
plt.ylim(5,55)
plt.xlim(40,240);
# Scientific libraries
from numpy import arange,array,ones
from scipy import stats
# Toy arrays left over from a template example; not used in the fit below
xi = arange(0,9)
A = array([ xi, ones(9)])
# (Almost) linear sequence
y = [19, 20, 20.5, 21.5, 22, 23, 23, 25.5, 24]
# Generated linear fit
slope, intercept, r_value, p_value, std_err = stats.linregress(auto.horsepower,auto.mpg)
line = slope*auto.horsepower+intercept
print(slope)
print(intercept)
# Creating the dataset, and generating the plot
trace1 = go.Scatter(
x = auto.horsepower,
y = auto.mpg,
mode='markers',
marker=go.Marker(color='rgb(255, 127, 14)'),
name='Data'
)
trace2 = go.Scatter(
x=auto.horsepower,
y=line,
mode='lines',
marker=go.Marker(color='rgb(31, 119, 180)'),
name='Fit'
)
#annotation = go.Annotation(
# x=3.5,
# y=23.5,
# text='$R^2 = 0.9551,\\Y = 0.716X + 19.18$',
# showarrow=False,
# font=go.Font(size=16)
# )
layout = go.Layout(
title='Linear Fit in Python',
plot_bgcolor='rgb(229, 229, 229)',
xaxis=go.XAxis(zerolinecolor='rgb(255,255,255)', gridcolor='rgb(255,255,255)'),
yaxis=go.YAxis(zerolinecolor='rgb(255,255,255)', gridcolor='rgb(255,255,255)')
# annotations=[annotation]
)
data = [trace1, trace2]
fig = go.Figure(data=data, layout=layout)
py.iplot(fig, filename='Linear-Fit-in-python')
-0.157844733354
39.9358610212
# Generated linear fit
slope2, intercept2, r_value2, p_value2, std_err2 = stats.linregress(advertising.TV,advertising.Sales)
line2 = slope2*advertising.TV+intercept2
auto['horsepower2'] = auto.horsepower**2
auto.head(3)
 | mpg | cylinders | displacement | horsepower | weight | acceleration | year | origin | name | horsepower2
---|---|---|---|---|---|---|---|---|---|---
0 | 18.0 | 8 | 307.0 | 130.0 | 3504 | 12.0 | 70 | 1 | chevrolet chevelle malibu | 16900.0 |
1 | 15.0 | 8 | 350.0 | 165.0 | 3693 | 11.5 | 70 | 1 | buick skylark 320 | 27225.0 |
2 | 18.0 | 8 | 318.0 | 150.0 | 3436 | 11.0 | 70 | 1 | plymouth satellite | 22500.0 |
est = smf.ols('mpg ~ horsepower + horsepower2', auto).fit()
est.summary().tables[1]
 | coef | std err | t | P>|t| | [0.025 | 0.975]
---|---|---|---|---|---|---
Intercept | 56.9001 | 1.800 | 31.604 | 0.000 | 53.360 | 60.440 |
horsepower | -0.4662 | 0.031 | -14.978 | 0.000 | -0.527 | -0.405 |
horsepower2 | 0.0012 | 0.000 | 10.080 | 0.000 | 0.001 | 0.001 |
regr = skl_lm.LinearRegression()
# Linear fit
X = auto.horsepower.values.reshape(-1,1)
y = auto.mpg
regr.fit(X, y)
auto['pred1'] = regr.predict(X)
auto['resid1'] = auto.mpg - auto.pred1
# Quadratic fit
X2 = auto[['horsepower', 'horsepower2']].values  # .as_matrix() was removed in newer pandas
regr.fit(X2, y)
auto['pred2'] = regr.predict(X2)
auto['resid2'] = auto.mpg - auto.pred2
fig, (ax1,ax2) = plt.subplots(1,2, figsize=(12,5))
# Left plot
sns.regplot(auto.pred1, auto.resid1, lowess=True,
ax=ax1, line_kws={'color':'r', 'lw':1},
scatter_kws={'facecolors':'None', 'edgecolors':'k', 'alpha':0.5})
ax1.hlines(0,xmin=ax1.xaxis.get_data_interval()[0],
xmax=ax1.xaxis.get_data_interval()[1], linestyles='dotted')
ax1.set_title('Residual Plot for Linear Fit')
# Right plot
sns.regplot(auto.pred2, auto.resid2, lowess=True,
line_kws={'color':'r', 'lw':1}, ax=ax2,
scatter_kws={'facecolors':'None', 'edgecolors':'k', 'alpha':0.5})
ax2.hlines(0,xmin=ax2.xaxis.get_data_interval()[0],
xmax=ax2.xaxis.get_data_interval()[1], linestyles='dotted')
ax2.set_title('Residual Plot for Quadratic Fit')
for ax in fig.axes:
ax.set_xlabel('Fitted values')
ax.set_ylabel('Residuals')
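To put a number on what the residual plots suggest, we can compare the mean squared error and R-squared of the linear and quadratic fits. A quick sketch using the pred1 and pred2 columns created above:
# Sketch: quantify the linear vs. quadratic comparison from the residual plots.
from sklearn.metrics import mean_squared_error, r2_score
print('linear   ', mean_squared_error(auto.mpg, auto.pred1), r2_score(auto.mpg, auto.pred1))
print('quadratic', mean_squared_error(auto.mpg, auto.pred2), r2_score(auto.mpg, auto.pred2))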
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
from IPython import display
# Here is the App display embedded in the Jupyter notebook.
#This can be run only once in offline mode.
def show_app(app, # type: dash.Dash
port=9999,
width=1000,
height=350,
offline=True,
style=True,
**dash_flask_kwargs):
"""
Run the application inside a Jupyter notebook and show an iframe with it
:param app:
:param port:
:param width:
:param height:
:param offline:
:return:
"""
url = 'http://localhost:%d' % port
iframe = '<iframe src="{url}" width={width} height={height}></iframe>'.format(url=url,
width=width,
height=height)
display.display_html(iframe, raw=True)
if offline:
app.css.config.serve_locally = False
app.scripts.config.serve_locally = False
if style:
external_css = ["https://fonts.googleapis.com/css?family=Raleway:400,300,600",
"https://maxcdn.bootstrapcdn.com/font-awesome/4.7.0/css/font-awesome.min.css",
"http://getbootstrap.com/dist/css/bootstrap.min.css",
"https://codepen.io/chriddyp/pen/bWLwgP.css"]
for css in external_css:
app.css.append_css({"external_url": css})
external_js = ["https://code.jquery.com/jquery-3.2.1.min.js",
"https://cdn.rawgit.com/plotly/dash-app-stylesheets/a3401de132a6d0b652ba11548736b1d1e80aa10d/dash-goldman-sachs-report-js.js",
"http://getbootstrap.com/dist/js/bootstrap.min.js"]
for js in external_js:
app.scripts.append_script({"external_url": js})
    return app.run_server(debug=True,  # set debug=False if the reloader misbehaves in Jupyter
port=port,
**dash_flask_kwargs)
# Here is the Dash App Layout and interaction behaviour defined.
app = dash.Dash()
#stylesheets = {'stylesheet.css': 'https://codepen.io/chriddyp/pen/bWLwgP.css'}
app.layout = html.Div(children=[
html.Link(
rel='stylesheet',
href='/static/bWLwgP.css'
),
html.H1(children='Predicting Sales using Regression'),
html.Div(children=[html.Label('TV Advertising Spend in $million '),
dcc.Input(id='tv-id', value='230.1', type='text')]),
html.Div(children=[html.Label('Radio Advertising Spend in $million '),
dcc.Input(id='radio-id', value='37.8', type='text')]),
html.Div(id='predicted-div'),
dcc.Graph(
id='advert-graph',
figure={
'data': [
go.Scatter(
x=advertising['TV'],
y=advertising['Sales'],
#text='' + df2['TV'],
mode='markers',
opacity=0.7,
marker={
'size': 15,
'line': {'width': 0.5, 'color': 'white'}
}
)
],
'layout': go.Layout(
xaxis={'type': 'log', 'title': 'TV Ad Spend'},
yaxis={'title': 'Sales Turnover'},
margin={'l': 40, 'b': 40, 't': 10, 'r': 10},
legend={'x': 0, 'y': 1},
hovermode='closest'
)
}
)
])
@app.callback(
Output(component_id='predicted-div', component_property='children'),
[Input(component_id='tv-id', component_property='value'),
Input(component_id='radio-id', component_property='value')]
)
def update_output_div(tv_value,radio_value):
    # Dash passes the input values as strings, so convert them before predicting
    prediction = predict(float(tv_value), float(radio_value))
return 'You\'ve entered TV Spend as "{}" and Radio spend as "{}". And the predicted sales is "{}"'.format(tv_value, radio_value, prediction[0])
show_app(app)
* Running on http://127.0.0.1:9999/ (Press CTRL+C to quit)