Deep Learning Models -- A collection of various deep learning architectures, models, and tips for TensorFlow and PyTorch in Jupyter Notebooks.

In [ ]:
%load_ext watermark
%watermark -a 'Sebastian Raschka' -v -p torch

Replacing Fully-Connected by Equivalent Convolutional Layers

In [15]:
import torch

Assume we have a 2x2 input image:

In [16]:
inputs = torch.tensor([[[[1., 2.],
                         [3., 4.]]]])

inputs.shape
Out[16]:
torch.Size([1, 1, 2, 2])

Fully Connected

A fully connected layer, which maps the 4 input features to 2 outputs, would be computed as follows:

In [17]:
fc = torch.nn.Linear(4, 2)

weights = torch.tensor([[1.1, 1.2, 1.3, 1.4],
                        [1.5, 1.6, 1.7, 1.8]])
bias = torch.tensor([1.9, 2.0])
fc.weight.data = weights
fc.bias.data = bias
In [18]:
torch.relu(fc(inputs.view(-1, 4)))
Out[18]:
tensor([[14.9000, 19.0000]], grad_fn=<ReluBackward0>)
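As a sanity check, the same outputs can be reproduced with an explicit matrix multiplication (a sketch using the weights, bias, and input defined above):

```python
import torch

weights = torch.tensor([[1.1, 1.2, 1.3, 1.4],
                        [1.5, 1.6, 1.7, 1.8]])
bias = torch.tensor([1.9, 2.0])
x = torch.tensor([1., 2., 3., 4.])  # the 2x2 input, flattened row-major

# a fully connected layer computes W x + b, followed here by ReLU
out = torch.relu(weights @ x + bias)
print(out)  # tensor([14.9000, 19.0000])
```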

Convolution with Kernels equal to the input size

We can obtain the same outputs if we use a convolutional layer whose kernel size equals the spatial size of the input feature array:

In [19]:
conv = torch.nn.Conv2d(in_channels=1,
                       out_channels=2,
                       kernel_size=inputs.squeeze().size())
print(conv.weight.size())
print(conv.bias.size())
torch.Size([2, 1, 2, 2])
torch.Size([2])
In [20]:
conv.weight.data = weights.view(2, 1, 2, 2)
conv.bias.data = bias
In [21]:
torch.relu(conv(inputs))
Out[21]:
tensor([[[[14.9000]],

         [[19.0000]]]], grad_fn=<ReluBackward0>)
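Since the convolutional layer reuses the same weights and bias, the flattened convolution output should match the fully connected output exactly. A quick equivalence check, reconstructing both layers from the weights defined earlier:

```python
import torch

weights = torch.tensor([[1.1, 1.2, 1.3, 1.4],
                        [1.5, 1.6, 1.7, 1.8]])
bias = torch.tensor([1.9, 2.0])
inputs = torch.tensor([[[[1., 2.],
                         [3., 4.]]]])

# fully connected layer on the flattened input
fc = torch.nn.Linear(4, 2)
fc.weight.data = weights
fc.bias.data = bias
fc_out = torch.relu(fc(inputs.view(-1, 4)))

# convolution with kernel size equal to the input size
conv = torch.nn.Conv2d(in_channels=1, out_channels=2, kernel_size=(2, 2))
conv.weight.data = weights.view(2, 1, 2, 2)
conv.bias.data = bias
conv_out = torch.relu(conv(inputs)).view(-1, 2)

print(torch.allclose(fc_out, conv_out))  # True
```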

Convolution with 1x1 Kernels

Similarly, we can replace the fully connected layer with a convolutional layer using 1x1 kernels if we reshape the input image into a num_inputs x 1 x 1 image:

In [23]:
conv = torch.nn.Conv2d(in_channels=4,
                       out_channels=2,
                       kernel_size=(1, 1))

conv.weight.data = weights.view(2, 4, 1, 1)
conv.bias.data = bias
torch.relu(conv(inputs.view(1, 4, 1, 1)))
Out[23]:
tensor([[[[14.9000]],

         [[19.0000]]]], grad_fn=<ReluBackward0>)
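A practical upside of the 1x1-kernel formulation (a sketch beyond the example above, using a hypothetical random feature map): the same convolutional layer also accepts inputs with larger spatial dimensions, applying the identical fully-connected-style mapping independently at every spatial position, which is the basic idea behind fully convolutional networks.

```python
import torch

conv = torch.nn.Conv2d(in_channels=4, out_channels=2, kernel_size=(1, 1))

# a hypothetical 3x3 feature map with 4 channels
feature_map = torch.rand(1, 4, 3, 3)

# the same per-position linear mapping is applied at all 9 locations
out = conv(feature_map)
print(out.shape)  # torch.Size([1, 2, 3, 3])
```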