skorch is designed to maximize interoperability between sklearn and pytorch. The aim is to keep 99% of the flexibility of pytorch while being able to leverage most features of sklearn. Below, we show the basic usage of skorch and how it can be combined with sklearn.

This notebook shows you how to use the basic functionality of skorch.
import torch
from torch import nn
import torch.nn.functional as F
torch.manual_seed(0);
We load a toy classification task from sklearn.
import numpy as np
from sklearn.datasets import make_classification
X, y = make_classification(1000, 20, n_informative=10, random_state=0)
X = X.astype(np.float32)
X.shape, y.shape, y.mean()
((1000, 20), (1000,), 0.5)
pytorch classification module

We define a vanilla neural network with two hidden layers. The output layer should have 2 output units since there are two classes. In addition, it should have a softmax nonlinearity, because later, when calling predict_proba, the output from the forward call will be used.
class ClassifierModule(nn.Module):
    def __init__(
            self,
            num_units=10,
            nonlin=F.relu,
            dropout=0.5,
    ):
        super(ClassifierModule, self).__init__()
        self.num_units = num_units
        self.nonlin = nonlin

        self.dense0 = nn.Linear(20, num_units)
        self.dropout = nn.Dropout(dropout)
        self.dense1 = nn.Linear(num_units, 10)
        self.output = nn.Linear(10, 2)

    def forward(self, X, **kwargs):
        X = self.nonlin(self.dense0(X))
        X = self.dropout(X)
        X = F.relu(self.dense1(X))
        X = F.softmax(self.output(X), dim=-1)
        return X
We use NeuralNetClassifier because we're dealing with a classification task. The first argument should be the pytorch module. As additional arguments, we pass the number of epochs and the learning rate (lr), but those are optional.

Note: To use the CUDA backend, pass device='cuda' as an additional argument.
from skorch import NeuralNetClassifier
net = NeuralNetClassifier(
    ClassifierModule,
    max_epochs=20,
    lr=0.1,
    # device='cuda',  # uncomment this to train with CUDA
)
As in sklearn, we call fit, passing the input data X and the targets y. By default, NeuralNetClassifier makes a StratifiedKFold split on the data (80/20) to track the validation loss. This is shown, as well as the train loss and the accuracy on the validation set.
net.fit(X, y)
  epoch    train_loss    valid_acc    valid_loss     dur
-------  ------------  -----------  ------------  ------
      1        0.6905       0.6150        0.6749  0.2752
      2        0.6648       0.6450        0.6633  0.0091
      3        0.6619       0.6750        0.6533  0.0088
      4        0.6429       0.6800        0.6399  0.0090
      5        0.6307       0.6950        0.6254  0.0090
      6        0.6291       0.7000        0.6134  0.0090
      7        0.6102       0.7100        0.6033  0.0087
      8        0.6050       0.7000        0.5931  0.0091
      9        0.5966       0.7000        0.5844  0.0090
     10        0.5636       0.7100        0.5689  0.0091
     11        0.5757       0.7200        0.5628  0.0089
     12        0.5757       0.7200        0.5520  0.0090
     13        0.5559       0.7300        0.5459  0.0089
     14        0.5541       0.7300        0.5424  0.0089
     15        0.5659       0.7350        0.5378  0.0089
     16        0.5364       0.7350        0.5322  0.0089
     17        0.5456       0.7300        0.5239  0.0090
     18        0.5476       0.7450        0.5260  0.0089
     19        0.5499       0.7500        0.5249  0.0088
     20        0.5273       0.7350        0.5251  0.0089
<class 'skorch.classifier.NeuralNetClassifier'>[initialized](
  module_=ClassifierModule(
    (dense0): Linear(in_features=20, out_features=10, bias=True)
    (dropout): Dropout(p=0.5)
    (dense1): Linear(in_features=10, out_features=10, bias=True)
    (output): Linear(in_features=10, out_features=2, bias=True)
  ),
)
Also, as in sklearn, you may call predict or predict_proba on the fitted model.
y_pred = net.predict(X[:5])
y_pred
array([0, 0, 0, 0, 0])
y_proba = net.predict_proba(X[:5])
y_proba
array([[0.5349464 , 0.46505365],
       [0.8685093 , 0.1314907 ],
       [0.6860039 , 0.31399614],
       [0.9126012 , 0.08739878],
       [0.69675475, 0.30324525]], dtype=float32)
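As with any sklearn classifier, each row of predict_proba sums to one (thanks to the softmax output layer), and predict is simply the argmax over the class axis. A minimal numpy sketch, reusing the probabilities printed above, illustrates the relationship:

```python
import numpy as np

# Probabilities as returned by predict_proba for the first five samples
y_proba = np.array([
    [0.5349464 , 0.46505365],
    [0.8685093 , 0.1314907 ],
    [0.6860039 , 0.31399614],
    [0.9126012 , 0.08739878],
    [0.69675475, 0.30324525],
], dtype=np.float32)

# Each row sums to (approximately) 1 because of the softmax nonlinearity
assert np.allclose(y_proba.sum(axis=1), 1.0)

# predict corresponds to the argmax over the class axis
y_pred = y_proba.argmax(axis=-1)
print(y_pred)  # -> [0 0 0 0 0]
```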
from sklearn.datasets import make_regression
X_regr, y_regr = make_regression(1000, 20, n_informative=10, random_state=0)
X_regr = X_regr.astype(np.float32)
y_regr = y_regr.astype(np.float32) / 100
y_regr = y_regr.reshape(-1, 1)
X_regr.shape, y_regr.shape, y_regr.min(), y_regr.max()
((1000, 20), (1000, 1), -6.4901485, 6.154505)
Note: Regression currently requires the target to be 2-dimensional, hence the need to reshape. This should be fixed with an upcoming version of pytorch.
pytorch regression module

Again, we define a vanilla neural network with two hidden layers. The main difference is that the output layer only has one unit and does not apply a softmax nonlinearity.
class RegressorModule(nn.Module):
    def __init__(
            self,
            num_units=10,
            nonlin=F.relu,
    ):
        super(RegressorModule, self).__init__()
        self.num_units = num_units
        self.nonlin = nonlin

        self.dense0 = nn.Linear(20, num_units)
        self.dense1 = nn.Linear(num_units, 10)
        self.output = nn.Linear(10, 1)

    def forward(self, X, **kwargs):
        X = self.nonlin(self.dense0(X))
        X = F.relu(self.dense1(X))
        X = self.output(X)
        return X
Training a regressor is almost the same as training a classifier. Mainly, we use NeuralNetRegressor instead of NeuralNetClassifier (this is the same terminology as in sklearn).
from skorch import NeuralNetRegressor
net_regr = NeuralNetRegressor(
    RegressorModule,
    max_epochs=20,
    lr=0.1,
    # device='cuda',  # uncomment this to train with CUDA
)
net_regr.fit(X_regr, y_regr)
  epoch    train_loss    valid_loss     dur
-------  ------------  ------------  ------
      1        4.4168        3.0788  0.0252
      2        2.0120        0.4565  0.0100
      3        0.3343        0.2262  0.0098
      4        0.1851        0.2223  0.0098
      5        0.1491        0.1068  0.0098
      6        0.0946        0.1207  0.0099
      7        0.0739        0.0663  0.0098
      8        0.0554        0.0706  0.0097
      9        0.0437        0.0461  0.0099
     10        0.0372        0.0469  0.0097
     11        0.0291        0.0343  0.0098
     12        0.0270        0.0333  0.0099
     13        0.0207        0.0265  0.0098
     14        0.0196        0.0249  0.0098
     15        0.0152        0.0215  0.0099
     16        0.0151        0.0198  0.0098
     17        0.0120        0.0182  0.0097
     18        0.0119        0.0167  0.0099
     19        0.0100        0.0159  0.0097
     20        0.0097        0.0149  0.0098
<class 'skorch.regressor.NeuralNetRegressor'>[initialized](
  module_=RegressorModule(
    (dense0): Linear(in_features=20, out_features=10, bias=True)
    (dense1): Linear(in_features=10, out_features=10, bias=True)
    (output): Linear(in_features=10, out_features=1, bias=True)
  ),
)
You may call predict or predict_proba on the fitted model. For regression, both methods return the same value.
y_pred = net_regr.predict(X_regr[:5])
y_pred
array([[ 0.4903931 ],
       [-1.4224019 ],
       [-0.77500594],
       [-0.06901944],
       [-0.3867012 ]], dtype=float32)
Save and load either the whole model by using pickle, or just the learned model parameters by calling save_params and load_params.
import pickle
file_name = '/tmp/mymodel.pkl'
with open(file_name, 'wb') as f:
    pickle.dump(net, f)
/home/thomasfan/anaconda3/lib/python3.6/site-packages/torch/serialization.py:241: UserWarning: Couldn't retrieve source code for container of type ClassifierModule. It won't be checked for correctness upon loading. "type " + obj.__name__ + ". It won't be checked "
with open(file_name, 'rb') as f:
    new_net = pickle.load(f)
This only saves and loads the proper module parameters, meaning that hyperparameters such as lr and max_epochs are not saved. Therefore, to load the model, we have to re-initialize it beforehand.
net.save_params(f_params=file_name)  # a file handler also works

# first initialize the model
new_net = NeuralNetClassifier(
    ClassifierModule,
    max_epochs=20,
    lr=0.1,
).initialize()

new_net.load_params(f_params=file_name)
sklearn Pipeline

It is possible to put the NeuralNetClassifier inside an sklearn Pipeline, as you would with any sklearn classifier.
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
pipe = Pipeline([
    ('scale', StandardScaler()),
    ('net', net),
])
pipe.fit(X, y)
Re-initializing module!
  epoch    train_loss    valid_acc    valid_loss     dur
-------  ------------  -----------  ------------  ------
      1        0.7243       0.5000        0.7105  0.0087
      2        0.7057       0.5000        0.6996  0.0089
      3        0.6971       0.5000        0.6949  0.0089
      4        0.6936       0.5050        0.6929  0.0089
      5        0.6923       0.5400        0.6916  0.0088
      6        0.6905       0.5000        0.6906  0.0088
      7        0.6894       0.5100        0.6899  0.0088
      8        0.6891       0.5150        0.6892  0.0088
      9        0.6899       0.5250        0.6885  0.0088
     10        0.6844       0.5300        0.6876  0.0088
     11        0.6853       0.5650        0.6865  0.0089
     12        0.6842       0.5700        0.6855  0.0088
     13        0.6821       0.5850        0.6844  0.0089
     14        0.6821       0.6050        0.6832  0.0088
     15        0.6820       0.6100        0.6820  0.0088
     16        0.6769       0.6100        0.6800  0.0088
     17        0.6784       0.6200        0.6780  0.0088
     18        0.6763       0.6450        0.6761  0.0088
     19        0.6704       0.6550        0.6729  0.0088
     20        0.6691       0.6750        0.6699  0.0088
Pipeline(memory=None,
     steps=[('scale', StandardScaler(copy=True, with_mean=True, with_std=True)),
            ('net', <class 'skorch.classifier.NeuralNetClassifier'>[initialized](
  module_=ClassifierModule(
    (dense0): Linear(in_features=20, out_features=10, bias=True)
    (dropout): Dropout(p=0.5)
    (dense1): Linear(in_features=10, out_features=10, bias=True)
    (output): Linear(in_features=10, out_features=2, bias=True)
  ),
))])
y_proba = pipe.predict_proba(X[:5])
y_proba
array([[0.5064775 , 0.49352255],
       [0.5324396 , 0.46756035],
       [0.57306874, 0.42693123],
       [0.54179883, 0.4582012 ],
       [0.5528906 , 0.44710943]], dtype=float32)
To save the whole pipeline, including the pytorch module, use pickle.
Adding a new callback to the model is straightforward. Below we show how to add a new callback that determines the area under the ROC (AUC) score.
from skorch.callbacks import EpochScoring
There is a scoring callback in skorch, EpochScoring, which we use for this. We have to specify which score to calculate. We have 3 choices:

- A string: This should be the name of an sklearn metric. For a list of all existing scores, look here.
- None: If you implement your own .score method on your neural net, passing scoring=None will tell skorch to use that.
- A callable: If we want to define our own scoring function, we pass a function with the signature func(model, X, y) -> score, which is then used.

Note that this works exactly the same as scoring in sklearn does.
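To illustrate the callable option, here is a sketch of a custom scorer with the func(model, X, y) -> score signature. The DummyModel class is purely a hypothetical stand-in for a fitted net, so the example runs without skorch:

```python
import numpy as np

def my_accuracy(model, X, y):
    # Any callable with the signature func(model, X, y) -> score works;
    # EpochScoring would call it with the (partially) fitted net.
    return (model.predict(X) == y).mean()

class DummyModel:
    # Hypothetical stand-in for a fitted classifier: predicts class 1
    # whenever the first feature is positive.
    def predict(self, X):
        return (X[:, 0] > 0).astype(int)

X_demo = np.array([[1.0], [-1.0], [2.0], [-2.0]])
y_demo = np.array([1, 0, 1, 1])

score = my_accuracy(DummyModel(), X_demo, y_demo)
print(score)  # -> 0.75
```

Such a callable could then be passed to EpochScoring in place of the metric string.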
For our case here, since sklearn already implements AUC, we just pass the correct string 'roc_auc'. We should also tell the callback that higher scores are better (to get the correct colors printed below -- by default, lower scores are assumed to be better). Furthermore, we may specify a name argument for EpochScoring, and whether to use training data (by setting on_train=True) or validation data (which is the default).
auc = EpochScoring(scoring='roc_auc', lower_is_better=False)
Finally, we pass the scoring callback to the callbacks parameter as a list and then call fit. Notice that we get the printed scores and color highlighting for free.
net = NeuralNetClassifier(
    ClassifierModule,
    max_epochs=20,
    lr=0.1,
    callbacks=[auc],
)
net.fit(X, y)
  epoch    roc_auc    train_loss    valid_acc    valid_loss     dur
-------  ---------  ------------  -----------  ------------  ------
      1     0.6112        0.7076       0.5550        0.6802  0.0088
      2     0.6766        0.6750       0.6150        0.6626  0.0091
      3     0.7031        0.6560       0.6500        0.6498  0.0089
      4     0.7201        0.6364       0.6650        0.6381  0.0089
      5     0.7316        0.6176       0.6900        0.6285  0.0089
      6     0.7447        0.6094       0.7200        0.6183  0.0088
      7     0.7522        0.6170       0.7200        0.6090  0.0088
      8     0.7567        0.5786       0.7150        0.6032  0.0089
      9     0.7630        0.5850       0.7100        0.5954  0.0089
     10     0.7706        0.5770       0.7200        0.5889  0.0089
     11     0.7735        0.5740       0.7050        0.5842  0.0089
     12     0.7729        0.5771       0.7100        0.5859  0.0089
     13     0.7792        0.5557       0.7000        0.5745  0.0089
     14     0.7825        0.5810       0.7050        0.5691  0.0088
     15     0.7824        0.5634       0.7200        0.5691  0.0089
     16     0.7817        0.5778       0.7150        0.5704  0.0088
     17     0.7871        0.5624       0.7150        0.5633  0.0091
     18     0.7855        0.5613       0.7200        0.5660  0.0088
     19     0.7792        0.5637       0.7250        0.5722  0.0089
     20     0.7823        0.5516       0.7150        0.5681  0.0089
<class 'skorch.classifier.NeuralNetClassifier'>[initialized](
  module_=ClassifierModule(
    (dense0): Linear(in_features=20, out_features=10, bias=True)
    (dropout): Dropout(p=0.5)
    (dense1): Linear(in_features=10, out_features=10, bias=True)
    (output): Linear(in_features=10, out_features=2, bias=True)
  ),
)
For information on how to write custom callbacks, have a look at the Advanced_Usage notebook.
GridSearchCV

The NeuralNet class allows you to directly access parameters of the pytorch module by using the module__ prefix. So e.g. if you defined the module to have a num_units parameter, you can set it via the module__num_units argument. This is exactly the same logic that allows you to access estimator parameters in sklearn Pipelines and FeatureUnions.
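The double-underscore routing itself is just string splitting, the same convention sklearn uses for Pipeline and FeatureUnion parameters. A minimal sketch of how a key like module__num_units is decomposed (the dict and the 'net' bucket below are purely illustrative, not skorch internals):

```python
# Illustrative routing of double-underscore parameter names
params = {
    'lr': 0.05,                   # goes to the net itself
    'module__num_units': 20,      # routed to the pytorch module
    'optimizer__nesterov': True,  # routed to the optimizer
}

routed = {}
for key, value in params.items():
    prefix, sep, name = key.partition('__')
    if sep:  # key contains '__', so it belongs to a sub-component
        routed.setdefault(prefix, {})[name] = value
    else:    # no prefix: a parameter of the net itself
        routed.setdefault('net', {})[key] = value

print(routed)
# -> {'net': {'lr': 0.05}, 'module': {'num_units': 20},
#     'optimizer': {'nesterov': True}}
```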
This feature is useful in several ways. For one, it allows you to set those parameters in the model definition. Furthermore, it allows you to set parameters in an sklearn GridSearchCV as shown below.

In addition to the parameters prefixed by module__, you may access a couple of other attributes, such as those of the optimizer by using the optimizer__ prefix (again, see below). All those special prefixes are stored in the prefixes_ attribute:
print(', '.join(net.prefixes_))
module, iterator_train, iterator_valid, optimizer, criterion, callbacks, dataset
Below we show how to perform a grid search over the learning rate (lr), the module's number of hidden units (module__num_units), the module's dropout rate (module__dropout), and whether the SGD optimizer should use Nesterov momentum or not (optimizer__nesterov).
from sklearn.model_selection import GridSearchCV
net = NeuralNetClassifier(
    ClassifierModule,
    max_epochs=20,
    lr=0.1,
    verbose=0,
    optimizer__momentum=0.9,
)

params = {
    'lr': [0.05, 0.1],
    'module__num_units': [10, 20],
    'module__dropout': [0, 0.5],
    'optimizer__nesterov': [False, True],
}
gs = GridSearchCV(net, params, refit=False, cv=3, scoring='accuracy', verbose=2)
gs.fit(X, y)
Fitting 3 folds for each of 16 candidates, totalling 48 fits [CV] lr=0.05, module__dropout=0, module__num_units=10, optimizer__nesterov=False [CV] lr=0.05, module__dropout=0, module__num_units=10, optimizer__nesterov=False, total= 0.2s [CV] lr=0.05, module__dropout=0, module__num_units=10, optimizer__nesterov=False [CV] lr=0.05, module__dropout=0, module__num_units=10, optimizer__nesterov=False, total= 0.2s [CV] lr=0.05, module__dropout=0, module__num_units=10, optimizer__nesterov=False
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.2s remaining: 0.0s
[CV] lr=0.05, module__dropout=0, module__num_units=10, optimizer__nesterov=False, total= 0.2s [CV] lr=0.05, module__dropout=0, module__num_units=10, optimizer__nesterov=True [CV] lr=0.05, module__dropout=0, module__num_units=10, optimizer__nesterov=True, total= 0.2s [CV] lr=0.05, module__dropout=0, module__num_units=10, optimizer__nesterov=True [CV] lr=0.05, module__dropout=0, module__num_units=10, optimizer__nesterov=True, total= 0.2s [CV] lr=0.05, module__dropout=0, module__num_units=10, optimizer__nesterov=True [CV] lr=0.05, module__dropout=0, module__num_units=10, optimizer__nesterov=True, total= 0.2s [CV] lr=0.05, module__dropout=0, module__num_units=20, optimizer__nesterov=False [CV] lr=0.05, module__dropout=0, module__num_units=20, optimizer__nesterov=False, total= 0.2s [CV] lr=0.05, module__dropout=0, module__num_units=20, optimizer__nesterov=False [CV] lr=0.05, module__dropout=0, module__num_units=20, optimizer__nesterov=False, total= 0.2s [CV] lr=0.05, module__dropout=0, module__num_units=20, optimizer__nesterov=False [CV] lr=0.05, module__dropout=0, module__num_units=20, optimizer__nesterov=False, total= 0.2s [CV] lr=0.05, module__dropout=0, module__num_units=20, optimizer__nesterov=True [CV] lr=0.05, module__dropout=0, module__num_units=20, optimizer__nesterov=True, total= 0.2s [CV] lr=0.05, module__dropout=0, module__num_units=20, optimizer__nesterov=True [CV] lr=0.05, module__dropout=0, module__num_units=20, optimizer__nesterov=True, total= 0.2s [CV] lr=0.05, module__dropout=0, module__num_units=20, optimizer__nesterov=True [CV] lr=0.05, module__dropout=0, module__num_units=20, optimizer__nesterov=True, total= 0.2s [CV] lr=0.05, module__dropout=0.5, module__num_units=10, optimizer__nesterov=False [CV] lr=0.05, module__dropout=0.5, module__num_units=10, optimizer__nesterov=False, total= 0.2s [CV] lr=0.05, module__dropout=0.5, module__num_units=10, optimizer__nesterov=False [CV] lr=0.05, module__dropout=0.5, module__num_units=10, 
optimizer__nesterov=False, total= 0.2s [CV] lr=0.05, module__dropout=0.5, module__num_units=10, optimizer__nesterov=False [CV] lr=0.05, module__dropout=0.5, module__num_units=10, optimizer__nesterov=False, total= 0.2s [CV] lr=0.05, module__dropout=0.5, module__num_units=10, optimizer__nesterov=True [CV] lr=0.05, module__dropout=0.5, module__num_units=10, optimizer__nesterov=True, total= 0.2s [CV] lr=0.05, module__dropout=0.5, module__num_units=10, optimizer__nesterov=True [CV] lr=0.05, module__dropout=0.5, module__num_units=10, optimizer__nesterov=True, total= 0.2s [CV] lr=0.05, module__dropout=0.5, module__num_units=10, optimizer__nesterov=True [CV] lr=0.05, module__dropout=0.5, module__num_units=10, optimizer__nesterov=True, total= 0.2s [CV] lr=0.05, module__dropout=0.5, module__num_units=20, optimizer__nesterov=False [CV] lr=0.05, module__dropout=0.5, module__num_units=20, optimizer__nesterov=False, total= 0.2s [CV] lr=0.05, module__dropout=0.5, module__num_units=20, optimizer__nesterov=False [CV] lr=0.05, module__dropout=0.5, module__num_units=20, optimizer__nesterov=False, total= 0.2s [CV] lr=0.05, module__dropout=0.5, module__num_units=20, optimizer__nesterov=False [CV] lr=0.05, module__dropout=0.5, module__num_units=20, optimizer__nesterov=False, total= 0.2s [CV] lr=0.05, module__dropout=0.5, module__num_units=20, optimizer__nesterov=True [CV] lr=0.05, module__dropout=0.5, module__num_units=20, optimizer__nesterov=True, total= 0.2s [CV] lr=0.05, module__dropout=0.5, module__num_units=20, optimizer__nesterov=True [CV] lr=0.05, module__dropout=0.5, module__num_units=20, optimizer__nesterov=True, total= 0.2s [CV] lr=0.05, module__dropout=0.5, module__num_units=20, optimizer__nesterov=True [CV] lr=0.05, module__dropout=0.5, module__num_units=20, optimizer__nesterov=True, total= 0.2s [CV] lr=0.1, module__dropout=0, module__num_units=10, optimizer__nesterov=False [CV] lr=0.1, module__dropout=0, module__num_units=10, optimizer__nesterov=False, total= 0.2s [CV] 
lr=0.1, module__dropout=0, module__num_units=10, optimizer__nesterov=False [CV] lr=0.1, module__dropout=0, module__num_units=10, optimizer__nesterov=False, total= 0.2s [CV] lr=0.1, module__dropout=0, module__num_units=10, optimizer__nesterov=False [CV] lr=0.1, module__dropout=0, module__num_units=10, optimizer__nesterov=False, total= 0.2s [CV] lr=0.1, module__dropout=0, module__num_units=10, optimizer__nesterov=True [CV] lr=0.1, module__dropout=0, module__num_units=10, optimizer__nesterov=True, total= 0.2s [CV] lr=0.1, module__dropout=0, module__num_units=10, optimizer__nesterov=True [CV] lr=0.1, module__dropout=0, module__num_units=10, optimizer__nesterov=True, total= 0.2s [CV] lr=0.1, module__dropout=0, module__num_units=10, optimizer__nesterov=True [CV] lr=0.1, module__dropout=0, module__num_units=10, optimizer__nesterov=True, total= 0.2s [CV] lr=0.1, module__dropout=0, module__num_units=20, optimizer__nesterov=False [CV] lr=0.1, module__dropout=0, module__num_units=20, optimizer__nesterov=False, total= 0.2s [CV] lr=0.1, module__dropout=0, module__num_units=20, optimizer__nesterov=False [CV] lr=0.1, module__dropout=0, module__num_units=20, optimizer__nesterov=False, total= 0.2s [CV] lr=0.1, module__dropout=0, module__num_units=20, optimizer__nesterov=False [CV] lr=0.1, module__dropout=0, module__num_units=20, optimizer__nesterov=False, total= 0.2s [CV] lr=0.1, module__dropout=0, module__num_units=20, optimizer__nesterov=True [CV] lr=0.1, module__dropout=0, module__num_units=20, optimizer__nesterov=True, total= 0.2s [CV] lr=0.1, module__dropout=0, module__num_units=20, optimizer__nesterov=True [CV] lr=0.1, module__dropout=0, module__num_units=20, optimizer__nesterov=True, total= 0.2s [CV] lr=0.1, module__dropout=0, module__num_units=20, optimizer__nesterov=True [CV] lr=0.1, module__dropout=0, module__num_units=20, optimizer__nesterov=True, total= 0.2s [CV] lr=0.1, module__dropout=0.5, module__num_units=10, optimizer__nesterov=False [CV] lr=0.1, 
module__dropout=0.5, module__num_units=10, optimizer__nesterov=False, total= 0.2s [CV] lr=0.1, module__dropout=0.5, module__num_units=10, optimizer__nesterov=False [CV] lr=0.1, module__dropout=0.5, module__num_units=10, optimizer__nesterov=False, total= 0.2s [CV] lr=0.1, module__dropout=0.5, module__num_units=10, optimizer__nesterov=False [CV] lr=0.1, module__dropout=0.5, module__num_units=10, optimizer__nesterov=False, total= 0.2s [CV] lr=0.1, module__dropout=0.5, module__num_units=10, optimizer__nesterov=True [CV] lr=0.1, module__dropout=0.5, module__num_units=10, optimizer__nesterov=True, total= 0.2s [CV] lr=0.1, module__dropout=0.5, module__num_units=10, optimizer__nesterov=True [CV] lr=0.1, module__dropout=0.5, module__num_units=10, optimizer__nesterov=True, total= 0.2s [CV] lr=0.1, module__dropout=0.5, module__num_units=10, optimizer__nesterov=True [CV] lr=0.1, module__dropout=0.5, module__num_units=10, optimizer__nesterov=True, total= 0.2s [CV] lr=0.1, module__dropout=0.5, module__num_units=20, optimizer__nesterov=False [CV] lr=0.1, module__dropout=0.5, module__num_units=20, optimizer__nesterov=False, total= 0.2s [CV] lr=0.1, module__dropout=0.5, module__num_units=20, optimizer__nesterov=False [CV] lr=0.1, module__dropout=0.5, module__num_units=20, optimizer__nesterov=False, total= 0.2s [CV] lr=0.1, module__dropout=0.5, module__num_units=20, optimizer__nesterov=False [CV] lr=0.1, module__dropout=0.5, module__num_units=20, optimizer__nesterov=False, total= 0.2s [CV] lr=0.1, module__dropout=0.5, module__num_units=20, optimizer__nesterov=True [CV] lr=0.1, module__dropout=0.5, module__num_units=20, optimizer__nesterov=True, total= 0.2s [CV] lr=0.1, module__dropout=0.5, module__num_units=20, optimizer__nesterov=True [CV] lr=0.1, module__dropout=0.5, module__num_units=20, optimizer__nesterov=True, total= 0.2s [CV] lr=0.1, module__dropout=0.5, module__num_units=20, optimizer__nesterov=True [CV] lr=0.1, module__dropout=0.5, module__num_units=20, 
optimizer__nesterov=True, total= 0.2s
[Parallel(n_jobs=1)]: Done 48 out of 48 | elapsed: 8.2s finished
GridSearchCV(cv=3, error_score='raise',
       estimator=<class 'skorch.classifier.NeuralNetClassifier'>[uninitialized](
  module=<class '__main__.ClassifierModule'>,
),
       fit_params=None, iid=True, n_jobs=1,
       param_grid={'lr': [0.05, 0.1], 'module__num_units': [10, 20],
                   'module__dropout': [0, 0.5],
                   'optimizer__nesterov': [False, True]},
       pre_dispatch='2*n_jobs', refit=False, return_train_score='warn',
       scoring='accuracy', verbose=2)
print(gs.best_score_, gs.best_params_)
0.862 {'lr': 0.05, 'module__dropout': 0, 'module__num_units': 20, 'optimizer__nesterov': False}
Of course, we could further nest the NeuralNetClassifier within an sklearn Pipeline, in which case we just prefix the parameter by the name of the net (e.g. net__module__num_units).
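For instance, a parameter grid for a nested net might look like the sketch below. This is purely illustrative: the 'net' prefix must match whatever step name the net was given in the Pipeline.

```python
# Hypothetical grid for a net nested inside a Pipeline under step name 'net'
params = {
    'net__lr': [0.05, 0.1],
    'net__module__num_units': [10, 20],
}

# Each key is routed first to the pipeline step ('net'),
# then the remainder is routed within the net as before
for key in params:
    step, _, rest = key.partition('__')
    print(step, '->', rest)
# -> net -> lr
#    net -> module__num_units
```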