Predicting Stock Prices

This project demonstrates Machine Learning's ability to tackle one of the most challenging problems in the financial world - "predicting the unpredictable" - forecasting a stock price.

In this project, I have used only the techniques covered in Topic 9 of the course on Time Series analysis.

As a use case to show the predictive power of very simple algorithms such as Lasso and Ridge regression, I have downloaded data for a very famous Indian stock - "TATA MOTORS".

The link to this data is here:

https://in.finance.yahoo.com/quote/TATAMOTORS.NS/history?period1=662754600&period2=1544985000&interval=1d&filter=history&frequency=1d

There are a few important characteristics which I would like to outline here:

  1. We have about 28 years of data (from 02 Jan 1991 till 14 Dec 2018).
  2. There are two types of prices given in the data:

    a. Closing Prices, which do not take into account corporate actions such as dividend declarations or stock splits.

    b. Adjusted Closing Prices, which do account for the effect of dividend payments and stock splits. [This will be our target variable; a short illustrative sketch follows this list.]

  3. There is other information in the data as well, which may not be useful from our analysis point of view.
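
The sketch below is only a rough illustration of the adjustment idea (Yahoo Finance's exact methodology may differ): historical raw closes are multiplied by a cumulative factor that shrinks them for every subsequent split and dividend.

def adjustment_factor(split_ratio=1.0, dividend=0.0, prior_close=None):
    """Rough illustrative adjustment factor for one corporate action (not Yahoo's exact formula)."""
    factor = 1.0 / split_ratio                              # a 2-for-1 split halves historical prices
    if dividend and prior_close:
        factor *= (prior_close - dividend) / prior_close    # a cash dividend scales earlier prices down
    return factor

# e.g. a 2:1 split plus a 5-rupee dividend paid when the stock last closed at 200:
factor = adjustment_factor(split_ratio=2.0, dividend=5.0, prior_close=200.0)
adjusted_close = 180.0 * factor   # a raw historical close of 180 becomes 87.75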

So, let's dive in.

First, we will import the basic libraries into Python.

Loading Libraries and Data

In [1]:
import pandas as pd
import numpy as np
from fbprophet import Prophet
import matplotlib.pyplot as plt
%matplotlib inline

#setting figure size
from matplotlib.pyplot import rcParams
rcParams['figure.figsize'] = 20,10

#for normalizing data
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler(feature_range=(0, 1))

import warnings
warnings.filterwarnings('ignore')
In [2]:
# loading basic ML algorithms
from sklearn.linear_model import LinearRegression, RidgeCV, LassoCV
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import TimeSeriesSplit, cross_val_score

# a few more powerful algorithms as well - as we will see later, they do not outperform the basic ones here
import xgboost
import lightgbm
In [3]:
# load the downloaded data
import os
os.chdir('C:\\Users\\Abhik\\mlcourse.ai\\mlcourse.ai-master\\data')
In [4]:
# load the data into pandas dataframe
df = pd.read_csv('TATAMOTORS.NS.csv')
df.head()
Out[4]:
Date Open High Low Close Adj Close Volume
0 1991-01-02 20.9596 21.857901 20.9596 21.857901 4.031160 0.0
1 1991-01-03 20.9596 21.857901 20.9596 21.857901 4.031160 0.0
2 1991-01-04 NaN NaN NaN NaN NaN NaN
3 1991-01-07 20.3608 21.259001 20.0613 21.109301 3.893099 0.0
4 1991-01-08 NaN NaN NaN NaN NaN NaN
In [5]:
# since there are a few NaN values, we should remove them first
df.dropna(axis=0, inplace=True)
In [6]:
# Let's check the data once again
df.head(6)
Out[6]:
Date Open High Low Close Adj Close Volume
0 1991-01-02 20.959600 21.857901 20.9596 21.857901 4.031160 0.0
1 1991-01-03 20.959600 21.857901 20.9596 21.857901 4.031160 0.0
3 1991-01-07 20.360800 21.259001 20.0613 21.109301 3.893099 0.0
5 1991-01-09 21.259001 21.259001 20.0613 20.510500 3.782665 0.0
7 1991-01-11 20.360800 20.959600 20.0613 20.959600 3.865490 0.0
8 1991-01-14 20.660200 20.660200 20.0613 20.360800 3.755056 0.0
In [7]:
# We now need to convert the Dates into Pandas Date format
df['Date'] = pd.to_datetime(df.Date,format='%Y-%m-%d')
df.index = df['Date']
In [8]:
# Better to check the data once again
df.head()
Out[8]:
Date Open High Low Close Adj Close Volume
Date
1991-01-02 1991-01-02 20.959600 21.857901 20.9596 21.857901 4.031160 0.0
1991-01-03 1991-01-03 20.959600 21.857901 20.9596 21.857901 4.031160 0.0
1991-01-07 1991-01-07 20.360800 21.259001 20.0613 21.109301 3.893099 0.0
1991-01-09 1991-01-09 21.259001 21.259001 20.0613 20.510500 3.782665 0.0
1991-01-11 1991-01-11 20.360800 20.959600 20.0613 20.959600 3.865490 0.0

EDA and Feature Engineering

In [9]:
# Plot the Graph for Adjusted Closing Price

from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
import plotly
import plotly.graph_objs as go

init_notebook_mode(connected=True)
In [10]:
trace1 = go.Scatter(
    x=df.Date,
    y=df['Adj Close'],
    name='Adjusted Closing Price'
)
data = [trace1]
layout = {'title': 'Adjusted Closing Price'}
fig = go.Figure(data=data, layout=layout)
iplot(fig, show_link=False)
In [11]:
# Shape of the Data
df.shape
Out[11]:
(6734, 7)
In [12]:
# Let's create a new dataset in which we will store only the required inputs.

#setting index as date values
df['Date'] = pd.to_datetime(df.Date,format='%Y-%m-%d')
df.index = df['Date']

#sorting
data = df.sort_index(ascending=True, axis=0)

#creating a separate dataset
new_data = pd.DataFrame(index=range(0,len(df)),columns=['Date', 'Close'])

for i in range(0,len(data)):
    new_data['Date'][i] = data['Date'][i]
    new_data['Close'][i] = data['Adj Close'][i]
    
In [13]:
# Let's check the data once again
new_data.head()
Out[13]:
Date Close
0 1991-01-02 00:00:00 4.03116
1 1991-01-03 00:00:00 4.03116
2 1991-01-07 00:00:00 3.8931
3 1991-01-09 00:00:00 3.78267
4 1991-01-11 00:00:00 3.86549
In [14]:
# We will create a number of features on the Dates

new_data['year'] = new_data['Date'].map(lambda x : x.year)
new_data['month'] = new_data['Date'].map(lambda x : x.month)
new_data['day_week'] = new_data['Date'].map(lambda x : x.dayofweek)
new_data['quarter'] = new_data['Date'].map(lambda x : x.quarter)
new_data['week'] = new_data['Date'].map(lambda x : x.week)
new_data['quarter_start'] = new_data['Date'].map(lambda x : x.is_quarter_start)
new_data['quarter_end'] = new_data['Date'].map(lambda x : x.is_quarter_end)
new_data['month_start'] = new_data['Date'].map(lambda x : x.is_month_start)
new_data['month_end'] = new_data['Date'].map(lambda x : x.is_month_end)
new_data['year_start'] = new_data['Date'].map(lambda x : x.is_year_start)
new_data['year_end'] = new_data['Date'].map(lambda x : x.is_year_end)
new_data['week_year'] = new_data['Date'].map(lambda x : x.weekofyear)
new_data['quarter_start'] = new_data['quarter_start'].map(lambda x: 0 if x is False else 1)
new_data['quarter_end'] = new_data['quarter_end'].map(lambda x: 0 if x is False else 1)
new_data['month_start'] = new_data['month_start'].map(lambda x: 0 if x is False else 1)
new_data['month_end'] = new_data['month_end'].map(lambda x: 0 if x is False else 1)
new_data['year_start'] = new_data['year_start'].map(lambda x: 0 if x is False else 1)
new_data['year_end'] = new_data['year_end'].map(lambda x: 0 if x is False else 1)
new_data['day_month'] = new_data['Date'].map(lambda x: x.daysinmonth)

# Create a feature flagging Mondays and Fridays - the first and last trading days of the week can behave differently.
mon_fri_list = [0,4]
new_data['mon_fri'] = new_data['day_week'].map(lambda x: 1 if x in mon_fri_list  else 0)
In [15]:
# Re-indexing the data
new_data.index = new_data['Date']
new_data.drop('Date', inplace=True, axis=1)
new_data.head(2)
Out[15]:
Close year month day_week quarter week quarter_start quarter_end month_start month_end year_start year_end week_year day_month mon_fri
Date
1991-01-02 4.03116 1991 1 2 1 1 0 0 0 0 0 0 1 31 0
1991-01-03 4.03116 1991 1 3 1 1 0 0 0 0 0 0 1 31 0

Lags are very important features for any time-series prediction, as they capture the autocorrelation between past observations and the value we want to predict.

Here we take lags of 1 to 21 trading days (the market is open for roughly 21-22 days in a month).
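
A quick sanity check on this assumption (a small sketch using pandas' Series.autocorr) is to look at the autocorrelation of the adjusted close at a few lags; values near 1 confirm that recent prices carry most of the signal.

# Sketch: autocorrelation of the target at a few lags
for lag in [1, 5, 10, 21]:
    print("lag {:2d}: autocorrelation = {:.4f}".format(
        lag, new_data['Close'].astype(float).autocorr(lag=lag)))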

In [16]:
for i in range(1, 22):
    new_data["lag_{}".format(i)] = new_data.Close.shift(i)
In [17]:
new_data.head(3)
Out[17]:
Close year month day_week quarter week quarter_start quarter_end month_start month_end ... lag_12 lag_13 lag_14 lag_15 lag_16 lag_17 lag_18 lag_19 lag_20 lag_21
Date
1991-01-02 4.03116 1991 1 2 1 1 0 0 0 0 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1991-01-03 4.03116 1991 1 3 1 1 0 0 0 0 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1991-01-07 3.8931 1991 1 0 1 2 0 0 0 0 ... NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

3 rows × 36 columns

In [18]:
# Let's create dummies for the categorical features

cols = ['year', 'month', 'day_week', 'quarter', 'week', 
        'quarter_start', 'quarter_end', 'week_year', 'mon_fri', 'year_start', 'year_end',
       'month_start', 'month_end', 'day_month']

for i in cols:
    new_data = pd.concat([new_data.drop([i], axis=1), 
        pd.get_dummies(new_data[i], prefix=i)
    ], axis=1)
In [19]:
# Dropping NaNs (introduced by the lags), if any, and re-indexing

new_data = new_data.dropna()
new_data = new_data.reset_index(drop=True)
In [20]:
new_data.head()
Out[20]:
Close lag_1 lag_2 lag_3 lag_4 lag_5 lag_6 lag_7 lag_8 lag_9 ... year_end_0 year_end_1 month_start_0 month_start_1 month_end_0 month_end_1 day_month_28 day_month_29 day_month_30 day_month_31
0 3.67221 3.75506 3.25807 3.23044 3.17522 3.3685 3.47893 3.58939 3.69982 3.69982 ... 1 0 1 0 1 0 1 0 0 0
1 3.6446 3.67221 3.75506 3.25807 3.23044 3.17522 3.3685 3.47893 3.58939 3.69982 ... 1 0 1 0 1 0 1 0 0 0
2 3.81027 3.6446 3.67221 3.75506 3.25807 3.23044 3.17522 3.3685 3.47893 3.58939 ... 1 0 1 0 1 0 1 0 0 0
3 4.11399 3.81027 3.6446 3.67221 3.75506 3.25807 3.23044 3.17522 3.3685 3.47893 ... 1 0 1 0 1 0 1 0 0 0
4 4.25205 4.11399 3.81027 3.6446 3.67221 3.75506 3.25807 3.23044 3.17522 3.3685 ... 1 0 1 0 1 0 1 0 0 0

5 rows × 195 columns

In [21]:
new_data.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6713 entries, 0 to 6712
Columns: 195 entries, Close to day_month_31
dtypes: object(22), uint8(173)
memory usage: 2.2+ MB
In [22]:
# Target Variable
y = new_data.Close.values
y
Out[22]:
array([3.6722120000000005, 3.644603, 3.8102730000000005, ..., 164.100006,
       166.850006, 166.949997], dtype=object)

Splitting the Data into Train-Test

In [23]:
# Creating splitting index

test_index = int(len(new_data) * (1 - 0.30))
test_index
Out[23]:
4699

Since we don't want the training set to touch the immediate future, we leave a gap of two days: the training data stops at day x-1 and the test data starts at day x+1.

In [24]:
# splitting whole dataset on train and test

X_train = new_data.loc[:test_index-1].drop(['Close'], axis=1)
y_train = new_data.loc[:test_index-1]["Close"]
X_test = new_data.loc[test_index+1:].drop(["Close"], axis=1)
y_test = new_data.loc[test_index+1:]["Close"]  
In [25]:
# Let's visualize the train and test data together
plt.figure(figsize=(16,8))
plt.plot(y_train)
plt.plot(y_test)
Out[25]:
[<matplotlib.lines.Line2D at 0xeadfeb8>]
In [26]:
# Scaling the Data

from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

Machine Learning implementations

In [27]:
# First we will use the simplest of them all - Linear Regression

from sklearn.linear_model import LinearRegression, LassoCV, RidgeCV, Lasso, Ridge
lr = LinearRegression()
lr.fit(X_train_scaled, y_train)
Out[27]:
LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None,
         normalize=False)

For cross-validation (CV) on time-series data, we will use TimeSeriesSplit.
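
To make this concrete, here is a small sketch of the fold boundaries TimeSeriesSplit produces on our training set: each fold trains on an expanding window of past observations and validates on the block that immediately follows it, so the model is never evaluated on data older than its training window.

# Sketch: inspect the expanding-window folds produced by TimeSeriesSplit
for fold, (train_idx, val_idx) in enumerate(TimeSeriesSplit(n_splits=5).split(X_train_scaled), 1):
    print("fold {}: train [0 .. {}], validate [{} .. {}]".format(
        fold, train_idx[-1], val_idx[0], val_idx[-1]))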

Let's see Mean Absolute Error for our simplest model

In [28]:
from sklearn.model_selection import TimeSeriesSplit
from sklearn.model_selection import cross_val_score
tscv = TimeSeriesSplit(n_splits=5)
cv = cross_val_score(lr, X_train_scaled, y_train, scoring = 'neg_mean_absolute_error', cv=tscv)
mae = cv.mean()*(-1)
mae
Out[28]:
6573780029897.434

Oh gosh!! Linear Regression failed miserably to capture the pattern - most likely because the nearly 200 features (lags plus one-hot dummies) are highly correlated, which makes the unregularized OLS coefficients huge and unstable. Let's try regularized linear models.

But before that, we will reuse the plotting function from Topic 9 of the course to draw some informative graphs.

In [29]:
def plotModelResults(model, df_train, df_test, y_train, y_test, plot_intervals=False, plot_anomalies=False, scale=1.96, cv=tscv):
    """
    Plots modelled vs fact values
    
    model: fitted model 
    
    df_train, df_test: split feature sets
    
    y_train, y_test: targets
    
    plot_intervals: bool, if True, plot prediction intervals
    
    plot_anomalies: bool, if True, highlight test points that fall outside the intervals
    
    scale: float, sets the width of the intervals
    
    cv: cross validation method, needed for intervals
    
    """
    # making predictions for test
    prediction = model.predict(df_test)
    
    plt.figure(figsize=(20, 7))
    plt.plot(prediction, "g", label="prediction", linewidth=2.0)
    plt.plot(y_test.values, label="actual", linewidth=2.0)
    
    if plot_intervals:
        # calculate cv scores
        cv = cross_val_score(
            model, 
            df_train, 
            y_train, 
            cv=cv, 
            scoring="neg_mean_squared_error"
        )

        # calculate cv error deviation
        deviation = np.sqrt(cv.std())
        
        # calculate lower and upper intervals
        lower = prediction - (scale * deviation)
        upper = prediction + (scale * deviation)
        
        plt.plot(lower, "r--", label="upper bound / lower bound", alpha=0.5)
        plt.plot(upper, "r--", alpha=0.5)
        
    if plot_anomalies:
        # note: requires plot_intervals=True so that lower/upper are defined
        anomalies = np.array([np.NaN] * len(y_test))
        anomalies[y_test < lower] = y_test[y_test < lower]
        anomalies[y_test > upper] = y_test[y_test > upper]
        plt.plot(anomalies, "o", markersize=10, label="Anomalies")
        
    # calculate overall quality on test set
    mae  = mean_absolute_error(y_test, prediction)
    mape = mean_absolute_percentage_error(y_test, prediction)  # defined below, in In [31]
    plt.title("MAE {}, MAPE {}%".format(round(mae), round(mape, 2)))
    plt.legend(loc="best")
    plt.grid(True);

Another helper function, this one for plotting the coefficients

In [30]:
def getCoefficients(model):
    """Returns sorted coefficient values of the model"""
    coefs = pd.DataFrame(model.coef_, X_train.columns)
    coefs.columns = ["coef"]
    coefs["abs"] = coefs.coef.apply(np.abs)
    return coefs.sort_values(by="abs", ascending=False).drop(["abs"], axis=1)    
    

def plotCoefficients(model):
    """Plots sorted coefficient values of the model"""
    coefs = getCoefficients(model)
    
    plt.figure(figsize=(20, 7))
    coefs.coef.plot(kind='bar')
    plt.grid(True, axis='y')
    plt.hlines(y=0, xmin=0, xmax=len(coefs), linestyles='dashed')
    plt.show()

We will also define a loss metric - the Mean Absolute Percentage Error (MAPE) - which expresses the mean absolute error as a percentage of the actual values.
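
Formally, with $y_t$ the actual value and $\hat{y}_t$ the prediction:

$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{t=1}^{n}\left|\frac{y_t - \hat{y}_t}{y_t}\right|$$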

In [31]:
def mean_absolute_percentage_error(y_true, y_pred): 
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

Let's see the plot for Linear Regression

In [32]:
plotModelResults(lr, X_train_scaled, X_test_scaled, y_train, y_test, plot_intervals=True, plot_anomalies=True)

This plot does not tell us much, apart from the fact that our model has fared poorly in predicting the pattern.

Let's see the plot of the coefficients

In [33]:
plotCoefficients(lr)

Let's look at the correlation matrix and heat map for the features

In [34]:
import seaborn as sns
plt.figure(figsize=(15,10))
sns.heatmap(X_train.corr())
Out[34]:
<matplotlib.axes._subplots.AxesSubplot at 0x11d73f60>

Not much can be derived from this heat map - the only notable observation is that prices in a few of the years are completely uncorrelated with the rest.

Let's build our next model - Lasso Regression

In [35]:
lasso = LassoCV(cv=tscv, max_iter=10000)
lasso.fit(X_train_scaled, y_train)
Out[35]:
LassoCV(alphas=None, copy_X=True,
    cv=TimeSeriesSplit(max_train_size=None, n_splits=5), eps=0.001,
    fit_intercept=True, max_iter=10000, n_alphas=100, n_jobs=None,
    normalize=False, positive=False, precompute='auto', random_state=None,
    selection='cyclic', tol=0.0001, verbose=False)
In [36]:
plotModelResults(lasso, 
                 X_train_scaled, 
                 X_test_scaled,
                 y_train, 
                 y_test,
                 plot_intervals=True, plot_anomalies=True)
plotCoefficients(lasso)
In [37]:
coef = getCoefficients(lasso)
np.count_nonzero(coef['coef'] == 0)  # number of coefficients Lasso shrank exactly to zero
Out[37]:
181

Oh wow!

Around 181 features carried no value and were eliminated by the Lasso regression.
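
It is also worth checking which regularization strength the time-series CV settled on; a quick sketch using the fitted LassoCV object's alpha_ and mse_path_ attributes:

# Sketch: regularization strength selected by LassoCV and its average CV error
print("Chosen alpha: {:.6f}".format(lasso.alpha_))
print("Best mean CV MSE along the alpha path: {:.4f}".format(lasso.mse_path_.mean(axis=1).min()))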

Let's look at the top 10 features by coefficient value

In [38]:
coef.sort_values(by='coef', ascending=False).head(10)
Out[38]:
coef
lag_1 3.550651e+01
year_2009 5.308872e-02
year_2010 3.609759e-02
week_32 3.299126e-02
lag_8 2.146558e-03
week_year_32 4.339372e-07
lag_11 0.000000e+00
year_1993 -0.000000e+00
year_1992 -0.000000e+00
lag_3 0.000000e+00

It turns out that lag 1 is by far the most important feature - not surprising, since for a near-random-walk price series, yesterday's price is the single best predictor of today's.

Let's see how close our predicted values are to the actual values

In [39]:
from sklearn.linear_model import Lasso
# refit a plain Lasso (note: default alpha=1.0, not the CV-selected value) to generate point predictions
lasso = Lasso(max_iter=10000, random_state=17)

lasso.fit(X_train_scaled, y_train)
y_pred = lasso.predict(X_test_scaled)

columns = ['Close_actual', 'Close_pred']
df_pred_lasso = pd.DataFrame(columns = columns)

df_pred_lasso.Close_actual = y_test
df_pred_lasso.Close_pred = y_pred
In [40]:
plt.figure(figsize=(15,8))
plt.plot(df_pred_lasso)
plt.plot(df_pred_lasso.Close_pred, "b--", label="prediction", linewidth=1.0)
plt.plot(df_pred_lasso.Close_actual, "r--", label="actual", linewidth=1)
plt.legend(loc="best")
Out[40]:
<matplotlib.legend.Legend at 0x12bc6c18>
In [41]:
df_pred_lasso['diff'] = df_pred_lasso.Close_actual - df_pred_lasso.Close_pred
df_pred_lasso['perc_diff'] = ((df_pred_lasso['diff']) / (df_pred_lasso['Close_pred']))

df_pred_lasso.head(20)
Out[41]:
Close_actual Close_pred diff perc_diff
4700 203.47 195.652330 7.81726 0.0399548
4701 204.664 198.977729 5.68589 0.0285755
4702 199.958 200.140080 -0.181813 -0.000908429
4703 202.19 195.559567 6.63034 0.0339045
4704 203.089 197.732002 5.35734 0.0270939
4705 200.235 198.607568 1.62781 0.00819613
4706 201.87 195.829330 6.04064 0.0308464
4707 198.714 197.420545 1.293 0.00654948
4708 205.079 194.347871 10.731 0.0552153
4709 206.652 200.544286 6.10794 0.0304568
4710 206.877 202.075922 4.80096 0.0237582
4711 206.281 202.294616 3.9861 0.0197044
4712 200.469 201.714272 -1.24551 -0.00617461
4713 202.475 196.056521 6.41832 0.0327371
4714 199.647 198.009370 1.63773 0.00827095
4715 205.061 195.256651 9.80469 0.0502144
4716 213.45 200.527248 12.9229 0.0644448
4717 219.538 208.693534 10.8449 0.0519655
4718 219.651 214.620201 5.03094 0.0234411
4719 225.29 214.729957 10.5601 0.0491784

Amazing!!

Lasso Regression has done a very nice job in predicting the adjusted closing price of this stock

We can also run PCA to eliminate redundant features and noise from the data

In [42]:
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline

def plotPCA(pca):
    """
    Plots accumulated percentage of explained variance by component
    
    pca: fitted PCA object
    """
    components = range(1, pca.n_components_ + 1)
    variance = np.cumsum(np.round(pca.explained_variance_ratio_, decimals=4)*100)
    plt.figure(figsize=(20, 10))
    plt.bar(components, variance)
    
    # additionally mark the level of 95% of explained variance 
    plt.hlines(y = 95, xmin=0, xmax=len(components), linestyles='dashed', colors='red')
    
    plt.xlabel('PCA components')
    plt.ylabel('variance')
    plt.xticks(components)
    plt.show()
In [43]:
# Create PCA object: pca
pca = PCA()


# Train PCA on scaled data
pca = pca.fit(X_train_scaled)

# plot explained variance
plotPCA(pca)
In [44]:
pca_comp = PCA(0.95).fit(X_train_scaled)
print('We need %d components to explain 95%% of variance' 
      % pca_comp.n_components_)
We need 73 components to explain 95% of variance

PCA needs only 73 components to explain 95% of the variance.

Let's fit PCA with this number of components and transform the train and test data

In [45]:
# use the component count found above (n_components_) rather than the 0.95 threshold
pca = PCA(n_components=pca_comp.n_components_).fit(X_train_scaled)

pca_features_train = pca.transform(X_train_scaled)
pca_features_test = pca.transform(X_test_scaled)

Let's run the Linear Regression model once again to see whether the PCA features improve on the previous attempt

In [46]:
lr.fit(pca_features_train, y_train)
Out[46]:
LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None,
         normalize=False)
In [47]:
plotModelResults(lr, pca_features_train, pca_features_test, y_train, y_test, plot_intervals=True, plot_anomalies=True)

Super!

PCA has resulted in a clear improvement in the linear regression model
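
Since make_pipeline is already imported, the same two steps can also be chained into a single scikit-learn pipeline; a minimal sketch (keeping 95% of the variance, as above):

# Sketch: PCA + Linear Regression as one pipeline instead of two manual steps
pca_lr = make_pipeline(PCA(n_components=0.95), LinearRegression())
pca_lr.fit(X_train_scaled, y_train)
print("Test MAE: {:.2f}".format(mean_absolute_error(y_test, pca_lr.predict(X_test_scaled))))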

Let's run another model - Ridge Regression - and see how it fares

In [48]:
from sklearn.linear_model import Ridge
ridge = Ridge(max_iter=10000, random_state=17)

ridge.fit(X_train_scaled, y_train)
y_pred = ridge.predict(X_test_scaled)

columns = ['Close_actual', 'Close_pred']
df_pred_ridge = pd.DataFrame(columns = columns)

df_pred_ridge.Close_actual = y_test
df_pred_ridge.Close_pred = y_pred
In [49]:
plt.figure(figsize=(15,8))
plt.plot(df_pred_ridge)
plt.plot(df_pred_ridge.Close_pred, "b--", label="prediction", linewidth=1.0)
plt.plot(df_pred_ridge.Close_actual, "r--", label="actual", linewidth=1.0)
plt.legend(loc="best")
Out[49]:
<matplotlib.legend.Legend at 0x10f4eeb8>
In [50]:
df_pred_ridge['diff'] = df_pred_ridge.Close_actual - df_pred_ridge.Close_pred
df_pred_ridge['perc_diff'] = ((df_pred_ridge['diff']) / (df_pred_ridge['Close_pred']))*100
df_pred_ridge.head(20)
Out[50]:
Close_actual Close_pred diff perc_diff
4700 203.47 200.768340 2.70125 1.34546
4701 204.664 203.476201 1.18742 0.583567
4702 199.958 204.929404 -4.97114 -2.42578
4703 202.19 200.250048 1.93986 0.96872
4704 203.089 201.988105 1.10124 0.545198
4705 200.235 203.377847 -3.14247 -1.54514
4706 201.87 200.325602 1.54436 0.770926
4707 198.714 202.340981 -3.62743 -1.79273
4708 205.079 199.281304 5.79754 2.90922
4709 206.652 204.530661 2.12156 1.03728
4710 206.877 206.756599 0.120278 0.0581737
4711 206.281 206.377177 -0.0964613 -0.0467403
4712 200.469 206.263468 -5.7947 -2.80937
4713 202.475 201.571126 0.903712 0.448334
4714 199.647 202.145811 -2.49872 -1.2361
4715 205.061 200.438312 4.62303 2.30646
4716 213.45 204.856249 8.59395 4.19511
4717 219.538 213.327969 6.21042 2.91121
4718 219.651 219.360145 0.290993 0.132655
4719 225.29 219.161219 6.1288 2.79648

Not bad at all!

Lasso and Ridge turned out to be quite close, and both simple regularized models already perform remarkably well; a quick side-by-side comparison is sketched below.
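
As a quick numeric comparison (a sketch reusing the prediction frames built above), we can put the two models' test-set errors side by side:

# Sketch: test-set errors of the Lasso and Ridge predictions computed above
for name, frame in [("Lasso", df_pred_lasso), ("Ridge", df_pred_ridge)]:
    actual = frame.Close_actual.astype(float)
    pred = frame.Close_pred.astype(float)
    print("{}: MAE = {:.2f}, MAPE = {:.2f}%".format(
        name, mean_absolute_error(actual, pred),
        mean_absolute_percentage_error(actual, pred)))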

Let's see the plots for Ridge

In [51]:
from sklearn.linear_model import RidgeCV
ridge = RidgeCV(cv=tscv)
ridge.fit(X_train_scaled, y_train)

plotModelResults(ridge, X_train_scaled, X_test_scaled, y_train, y_test, plot_intervals=True, plot_anomalies=True)
plotCoefficients(ridge)