skorch is designed to maximize interoperability between sklearn and pytorch. The aim is to keep 99% of the flexibility of pytorch while being able to leverage most features of sklearn. Below, we show the basic usage of skorch and how it can be combined with sklearn.
This notebook shows you how to use the basic functionality of skorch.
! [ ! -z "$COLAB_GPU" ] && pip install torch skorch
import torch
from torch import nn
import torch.nn.functional as F
torch.manual_seed(0);
We load a toy classification task from sklearn.
import numpy as np
from sklearn.datasets import make_classification
X, y = make_classification(1000, 20, n_informative=10, random_state=0)
X, y = X.astype(np.float32), y.astype(np.int64)
X.shape, y.shape, y.mean()
((1000, 20), (1000,), 0.5)
A pytorch classification module

We define a vanilla neural network with two hidden layers. The output layer should have two output units since there are two classes. In addition, it should have a softmax nonlinearity, because later, when calling predict_proba, the output from the forward call will be used.
class ClassifierModule(nn.Module):
    def __init__(
            self,
            num_units=10,
            nonlin=F.relu,
            dropout=0.5,
    ):
        super(ClassifierModule, self).__init__()
        self.num_units = num_units
        self.nonlin = nonlin

        self.dense0 = nn.Linear(20, num_units)
        self.dropout = nn.Dropout(dropout)
        self.dense1 = nn.Linear(num_units, 10)
        self.output = nn.Linear(10, 2)

    def forward(self, X, **kwargs):
        X = self.nonlin(self.dense0(X))
        X = self.dropout(X)
        X = F.relu(self.dense1(X))
        X = F.softmax(self.output(X), dim=-1)
        return X
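The softmax at the end of forward turns the raw outputs into probabilities that sum to one over the classes, which is what predict_proba will later return. A minimal numpy sketch of that final step (the standalone softmax function here is just for illustration, not part of skorch):

```python
import numpy as np

def softmax(z, axis=-1):
    # subtract the row-wise max for numerical stability before exponentiating
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

logits = np.array([[2.0, 0.5], [-1.0, 3.0]])
probs = softmax(logits)  # each row now sums to 1
```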
We use NeuralNetClassifier because we're dealing with a classification task. The first argument should be the pytorch module. As additional arguments, we pass the number of epochs and the learning rate (lr), but those are optional.

Note: To use the CUDA backend, pass device='cuda' as an additional argument.
from skorch import NeuralNetClassifier
net = NeuralNetClassifier(
ClassifierModule,
max_epochs=20,
lr=0.1,
# device='cuda', # uncomment this to train with CUDA
)
As in sklearn, we call fit, passing the input data X and the targets y. By default, NeuralNetClassifier makes a stratified 80/20 split on the data to track the validation loss. The validation loss is shown below, alongside the train loss and the accuracy on the validation set.
net.fit(X, y)
  epoch    train_loss    valid_acc    valid_loss     dur
-------  ------------  -----------  ------------  ------
      1        0.6905       0.6150        0.6749  0.0235
      2        0.6648       0.6450        0.6633  0.0213
      3        0.6619       0.6750        0.6533  0.0219
      4        0.6429       0.6800        0.6399  0.0207
      5        0.6307       0.6950        0.6254  0.0192
      6        0.6291       0.7000        0.6134  0.0202
      7        0.6102       0.7100        0.6033  0.0220
      8        0.6050       0.7000        0.5931  0.0210
      9        0.5966       0.7000        0.5844  0.0217
     10        0.5636       0.7100        0.5689  0.0226
     11        0.5757       0.7200        0.5628  0.0196
     12        0.5757       0.7200        0.5520  0.0190
     13        0.5559       0.7300        0.5459  0.0218
     14        0.5541       0.7300        0.5424  0.0206
     15        0.5659       0.7350        0.5378  0.0215
     16        0.5364       0.7350        0.5322  0.0192
     17        0.5456       0.7300        0.5239  0.0221
     18        0.5476       0.7450        0.5260  0.0188
     19        0.5499       0.7500        0.5249  0.0213
     20        0.5273       0.7350        0.5251  0.0206
<class 'skorch.classifier.NeuralNetClassifier'>[initialized](
  module_=ClassifierModule(
    (dense0): Linear(in_features=20, out_features=10, bias=True)
    (dropout): Dropout(p=0.5)
    (dense1): Linear(in_features=10, out_features=10, bias=True)
    (output): Linear(in_features=10, out_features=2, bias=True)
  ),
)
Also, as in sklearn, you may call predict or predict_proba on the fitted model.
y_pred = net.predict(X[:5])
y_pred
array([0, 0, 0, 0, 0])
y_proba = net.predict_proba(X[:5])
y_proba
array([[0.5349464 , 0.46505365],
       [0.8685093 , 0.1314907 ],
       [0.6860039 , 0.31399614],
       [0.9126012 , 0.08739878],
       [0.69675475, 0.30324525]], dtype=float32)
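Because the module ends in a softmax, each row of predict_proba sums to one, and predict corresponds to the argmax over the class axis. A small numpy check on the (rounded) values printed above:

```python
import numpy as np

# the probabilities printed above, rounded for illustration
y_proba = np.array([
    [0.5349, 0.4651],
    [0.8685, 0.1315],
    [0.6860, 0.3140],
    [0.9126, 0.0874],
    [0.6968, 0.3032],
], dtype=np.float32)

row_sums = y_proba.sum(axis=1)     # each close to 1.0
y_pred = y_proba.argmax(axis=-1)   # matches net.predict: array([0, 0, 0, 0, 0])
```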
from sklearn.datasets import make_regression
X_regr, y_regr = make_regression(1000, 20, n_informative=10, random_state=0)
X_regr = X_regr.astype(np.float32)
y_regr = y_regr.astype(np.float32) / 100
y_regr = y_regr.reshape(-1, 1)
X_regr.shape, y_regr.shape, y_regr.min(), y_regr.max()
((1000, 20), (1000, 1), -6.4901485, 6.154505)
Note: Regression currently requires the target to be 2-dimensional, hence the need to reshape. This should be fixed with an upcoming version of pytorch.
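The reshape itself is a one-liner: reshape(-1, 1) turns a 1-dimensional target into a column vector, with -1 telling numpy to infer the number of rows:

```python
import numpy as np

y = np.arange(5, dtype=np.float32)  # shape (5,)
y_2d = y.reshape(-1, 1)             # shape (5, 1), as required for regression
```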
A pytorch regression module

Again, we define a vanilla neural network with two hidden layers. The main difference is that the output layer only has one unit and does not apply a softmax nonlinearity.
class RegressorModule(nn.Module):
    def __init__(
            self,
            num_units=10,
            nonlin=F.relu,
    ):
        super(RegressorModule, self).__init__()
        self.num_units = num_units
        self.nonlin = nonlin

        self.dense0 = nn.Linear(20, num_units)
        self.dense1 = nn.Linear(num_units, 10)
        self.output = nn.Linear(10, 1)

    def forward(self, X, **kwargs):
        X = self.nonlin(self.dense0(X))
        X = F.relu(self.dense1(X))
        X = self.output(X)
        return X
Training a regressor is almost the same as training a classifier. Mainly, we use NeuralNetRegressor instead of NeuralNetClassifier (this is the same terminology as in sklearn).
from skorch import NeuralNetRegressor
net_regr = NeuralNetRegressor(
RegressorModule,
max_epochs=20,
lr=0.1,
# device='cuda', # uncomment this to train with CUDA
)
net_regr.fit(X_regr, y_regr)
  epoch    train_loss    valid_loss     dur
-------  ------------  ------------  ------
      1        4.4168        3.0788  0.0292
      2        2.0120        0.4565  0.0270
      3        0.3343        0.2262  0.0263
      4        0.1851        0.2223  0.0257
      5        0.1491        0.1068  0.0242
      6        0.0946        0.1207  0.0263
      7        0.0739        0.0663  0.0290
      8        0.0554        0.0706  0.0298
      9        0.0437        0.0461  0.0337
     10        0.0372        0.0469  0.0273
     11        0.0291        0.0343  0.0263
     12        0.0270        0.0333  0.0285
     13        0.0207        0.0265  0.0281
     14        0.0196        0.0249  0.0344
     15        0.0152        0.0215  0.0286
     16        0.0151        0.0198  0.0281
     17        0.0120        0.0182  0.0283
     18        0.0119        0.0167  0.0266
     19        0.0100        0.0159  0.0266
     20        0.0097        0.0149  0.0259
<class 'skorch.regressor.NeuralNetRegressor'>[initialized](
  module_=RegressorModule(
    (dense0): Linear(in_features=20, out_features=10, bias=True)
    (dense1): Linear(in_features=10, out_features=10, bias=True)
    (output): Linear(in_features=10, out_features=1, bias=True)
  ),
)
You may call predict or predict_proba on the fitted model. For regressions, both methods return the same value.
y_pred = net_regr.predict(X_regr[:5])
y_pred
array([[ 0.4903931 ],
       [-1.4224019 ],
       [-0.77500594],
       [-0.06901944],
       [-0.3867012 ]], dtype=float32)
Save and load either the whole model by using pickle, or just the learned model parameters by calling save_params and load_params.
import pickle
file_name = '/tmp/mymodel.pkl'
with open(file_name, 'wb') as f:
pickle.dump(net, f)
/Users/thomasfan/anaconda3/lib/python3.7/site-packages/torch/serialization.py:241: UserWarning: Couldn't retrieve source code for container of type ClassifierModule. It won't be checked for correctness upon loading. "type " + obj.__name__ + ". It won't be checked "
with open(file_name, 'rb') as f:
new_net = pickle.load(f)
This only saves and loads the proper module parameters, meaning that hyperparameters such as lr and max_epochs are not saved. Therefore, to load the model, we have to re-initialize it beforehand.
net.save_params(f_params=file_name) # a file handler also works
# first initialize the model
new_net = NeuralNetClassifier(
ClassifierModule,
max_epochs=20,
lr=0.1,
).initialize()
new_net.load_params(file_name)
sklearn Pipeline

It is possible to put the NeuralNetClassifier inside an sklearn Pipeline, as you would with any sklearn classifier.
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
pipe = Pipeline([
('scale', StandardScaler()),
('net', net),
])
pipe.fit(X, y)
Re-initializing module!
  epoch    train_loss    valid_acc    valid_loss     dur
-------  ------------  -----------  ------------  ------
      1        0.7243       0.5000        0.7105  0.0184
      2        0.7057       0.5000        0.6996  0.0207
      3        0.6971       0.5000        0.6949  0.0192
      4        0.6936       0.5050        0.6929  0.0224
      5        0.6923       0.5400        0.6916  0.0210
      6        0.6905       0.5000        0.6906  0.0189
      7        0.6894       0.5100        0.6899  0.0194
      8        0.6891       0.5150        0.6892  0.0186
      9        0.6899       0.5250        0.6885  0.0202
     10        0.6844       0.5300        0.6876  0.0189
     11        0.6853       0.5650        0.6865  0.0199
     12        0.6842       0.5700        0.6855  0.0183
     13        0.6821       0.5850        0.6844  0.0199
     14        0.6821       0.6050        0.6832  0.0189
     15        0.6820       0.6100        0.6820  0.0206
     16        0.6769       0.6100        0.6800  0.0188
     17        0.6784       0.6200        0.6780  0.0219
     18        0.6763       0.6450        0.6761  0.0233
     19        0.6704       0.6550        0.6729  0.0254
     20        0.6691       0.6750        0.6699  0.0252
Pipeline(memory=None,
     steps=[('scale', StandardScaler(copy=True, with_mean=True, with_std=True)),
            ('net', <class 'skorch.classifier.NeuralNetClassifier'>[initialized](
  module_=ClassifierModule(
    (dense0): Linear(in_features=20, out_features=10, bias=True)
    (dropout): Dropout(p=0.5)
    (dense1): Linear(in_features=10, out_features=10, bias=True)
    (output): Linear(in_features=10, out_features=2, bias=True)
  ),
))])
y_proba = pipe.predict_proba(X[:5])
y_proba
array([[0.5064775 , 0.49352255],
       [0.53243965, 0.46756038],
       [0.57306874, 0.42693123],
       [0.54179883, 0.45820117],
       [0.5528906 , 0.44710937]], dtype=float32)
To save the whole pipeline, including the pytorch module, use pickle.
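A minimal sketch of that round trip; a plain sklearn estimator stands in for the net here so the example runs without skorch or pytorch installed, and an in-memory buffer stands in for a file:

```python
import pickle
from io import BytesIO

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = make_classification(100, 5, random_state=0)

# stand-in estimator; with skorch, the net would take the place of
# LogisticRegression and pickling works the same way
pipe = Pipeline([
    ('scale', StandardScaler()),
    ('clf', LogisticRegression()),
]).fit(X, y)

buf = BytesIO()
pickle.dump(pipe, buf)      # serializes every step, scaler included
buf.seek(0)
restored = pickle.load(buf)
```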
Adding a new callback to the model is straightforward. Below we show how to add a new callback that computes the area under the ROC curve (AUC).
from skorch.callbacks import EpochScoring
There is a scoring callback in skorch, EpochScoring, which we use for this. We have to specify which score to calculate. We have 3 choices:

- Passing a string: this should be a valid sklearn metric. For a list of all existing scores, look here.
- Passing None: if you implement your own .score method on your neural net, passing scoring=None will tell skorch to use that.
- Passing a function or callable: we may pass a function with the signature func(model, X, y) -> score, which is then used.

Note that this works exactly the same as scoring in sklearn does.
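For the string option, skorch relies on sklearn's scorer registry. The sketch below shows how a metric string resolves to a callable in sklearn itself, and what a custom func(model, X, y) -> score function looks like (my_accuracy is a made-up example, not part of either library):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import get_scorer

X, y = make_classification(200, 5, random_state=0)
model = LogisticRegression().fit(X, y)

# a string such as 'roc_auc' is looked up in sklearn's scorer registry
# and becomes a callable with the signature scorer(model, X, y) -> score
auc_scorer = get_scorer('roc_auc')
auc = auc_scorer(model, X, y)

# a custom scoring function uses the same signature
def my_accuracy(model, X, y):
    return (model.predict(X) == y).mean()

acc = my_accuracy(model, X, y)
```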
For our case here, since sklearn already implements AUC, we just pass the correct string 'roc_auc'. We should also tell the callback that higher scores are better (to get the correct colors printed below -- by default, lower scores are assumed to be better). Furthermore, we may specify a name argument for EpochScoring, and whether to use training data (by setting on_train=True) or validation data (which is the default).
auc = EpochScoring(scoring='roc_auc', lower_is_better=False)
Finally, we pass the scoring callback to the callbacks parameter as a list and then call fit. Notice that we get the printed scores and color highlighting for free.
net = NeuralNetClassifier(
ClassifierModule,
max_epochs=20,
lr=0.1,
callbacks=[auc],
)
net.fit(X, y)
  epoch    roc_auc    train_loss    valid_acc    valid_loss     dur
-------  ---------  ------------  -----------  ------------  ------
      1     0.6112        0.7076       0.5550        0.6802  0.0188
      2     0.6766        0.6750       0.6150        0.6626  0.0204
      3     0.7031        0.6560       0.6500        0.6498  0.0244
      4     0.7201        0.6364       0.6650        0.6381  0.0193
      5     0.7316        0.6176       0.6900        0.6285  0.0203
      6     0.7447        0.6094       0.7200        0.6183  0.0222
      7     0.7522        0.6170       0.7200        0.6090  0.0188
      8     0.7567        0.5786       0.7150        0.6032  0.0197
      9     0.7630        0.5850       0.7100        0.5954  0.0214
     10     0.7706        0.5770       0.7200        0.5889  0.0207
     11     0.7735        0.5740       0.7050        0.5842  0.0188
     12     0.7729        0.5771       0.7100        0.5859  0.0186
     13     0.7792        0.5557       0.7000        0.5745  0.0178
     14     0.7825        0.5810       0.7050        0.5691  0.0204
     15     0.7824        0.5634       0.7200        0.5691  0.0194
     16     0.7817        0.5778       0.7150        0.5704  0.0205
     17     0.7871        0.5624       0.7150        0.5633  0.0188
     18     0.7855        0.5613       0.7200        0.5660  0.0202
     19     0.7792        0.5637       0.7250        0.5722  0.0194
     20     0.7823        0.5516       0.7150        0.5681  0.0184
<class 'skorch.classifier.NeuralNetClassifier'>[initialized](
  module_=ClassifierModule(
    (dense0): Linear(in_features=20, out_features=10, bias=True)
    (dropout): Dropout(p=0.5)
    (dense1): Linear(in_features=10, out_features=10, bias=True)
    (output): Linear(in_features=10, out_features=2, bias=True)
  ),
)
For information on how to write custom callbacks, have a look at the Advanced_Usage notebook.
GridSearchCV

The NeuralNet class allows you to directly access parameters of the pytorch module by using the module__ prefix. So, e.g., if you defined the module to have a num_units parameter, you can set it via the module__num_units argument. This is exactly the same logic that allows you to access estimator parameters in sklearn Pipelines and FeatureUnions.

This feature is useful in several ways. For one, it allows you to set those parameters in the model definition. Furthermore, it allows you to set parameters in an sklearn GridSearchCV, as shown below.
In addition to the parameters prefixed by module__, you may access a couple of other attributes, such as those of the optimizer, by using the optimizer__ prefix (again, see below). All those special prefixes are stored in the prefixes_ attribute:
print(', '.join(net.prefixes_))
module, iterator_train, iterator_valid, optimizer, criterion, callbacks, dataset
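This double-underscore routing is the standard sklearn convention; the same mechanism at work in a plain Pipeline (the step names scale and clf are arbitrary):

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline([
    ('scale', StandardScaler()),
    ('clf', LogisticRegression(C=1.0)),
])

# '<step name>__<param>' routes the value to that step, just as
# 'module__<param>' routes to the pytorch module in skorch
pipe.set_params(clf__C=0.5)
```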
Below we show how to perform a grid search over the learning rate (lr), the module's number of hidden units (module__num_units), the module's dropout rate (module__dropout), and whether the SGD optimizer should use Nesterov momentum or not (optimizer__nesterov).
from sklearn.model_selection import GridSearchCV
net = NeuralNetClassifier(
ClassifierModule,
max_epochs=20,
lr=0.1,
verbose=0,
optimizer__momentum=0.9,
)
params = {
'lr': [0.05, 0.1],
'module__num_units': [10, 20],
'module__dropout': [0, 0.5],
'optimizer__nesterov': [False, True],
}
gs = GridSearchCV(net, params, refit=False, cv=3, scoring='accuracy', verbose=2)
gs.fit(X, y)
Fitting 3 folds for each of 16 candidates, totalling 48 fits
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[CV] lr=0.05, module__dropout=0, module__num_units=10, optimizer__nesterov=False
[CV] lr=0.05, module__dropout=0, module__num_units=10, optimizer__nesterov=False, total=   0.3s
(... 48 fits in total, each taking roughly 0.3s ...)
[CV] lr=0.1, module__dropout=0.5, module__num_units=20, optimizer__nesterov=True
[CV] lr=0.1, module__dropout=0.5, module__num_units=20, optimizer__nesterov=True, total=   0.3s
[Parallel(n_jobs=1)]: Done  48 out of  48 | elapsed:   15.7s finished
GridSearchCV(cv=3, error_score='raise-deprecating',
       estimator=<class 'skorch.classifier.NeuralNetClassifier'>[uninitialized](
  module=<class '__main__.ClassifierModule'>,
),
       fit_params=None, iid='warn', n_jobs=None,
       param_grid={'lr': [0.05, 0.1], 'module__num_units': [10, 20],
                   'module__dropout': [0, 0.5],
                   'optimizer__nesterov': [False, True]},
       pre_dispatch='2*n_jobs', refit=False, return_train_score='warn',
       scoring='accuracy', verbose=2)
print(gs.best_score_, gs.best_params_)
0.862 {'lr': 0.05, 'module__dropout': 0, 'module__num_units': 20, 'optimizer__nesterov': False}
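The 48 fits reported above follow directly from the grid: a quick sanity check of the number of candidates times the number of folds:

```python
from itertools import product

params = {
    'lr': [0.05, 0.1],
    'module__num_units': [10, 20],
    'module__dropout': [0, 0.5],
    'optimizer__nesterov': [False, True],
}

# the Cartesian product of all value lists gives the candidates
n_candidates = len(list(product(*params.values())))  # 2*2*2*2 = 16
n_fits = n_candidates * 3                            # times cv=3 folds = 48
```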
Of course, we could further nest the NeuralNetClassifier within an sklearn Pipeline, in which case we just prefix the parameter by the name of the net (e.g. net__module__num_units).