This could be achieved by setting the ‘size‘ argument to (2, 3). Out[4]: torch.Size([2, 8, 64, 64]) And it does! In my example, the minimum output size is 22x22, and output_padding provides a number of lines (between 0 and 6) to add at the bottom of the output image and a number of columns (between 0 and 6) to add at the right of the output image. shape # should be same as x.shape. The result of applying this operation to a 2×2 image would be a 4×6 output … Can be a single integer to specify the same value for all spatial dimensions. This is defined by the ‘size‘ argument that is set to the tuple (2,2). Integer, the dimensionality of the output space (i.e. the number of output filters in the convolution). kernel_size: An integer or list of 2 integers, specifying the width and height of the 2D convolution window. These parameters are filter size, stride and zero padding. But when thinking about transposed convolutions from a distribution perspective, we stride over the output, which increases the size of the output. In a convolutional neural network, there are 3 main parameters that need to be tweaked to modify the behavior of a convolutional layer. Notice that the weights of this convolution transpose layer are all random, and are unrelated to the weights of the original Conv2d. In the final ‘ConvTranspose2d’ we will be outputting 3 filters as the output image of the generator is going to be a 3 channel(RGB) and we apply a ‘Tanh’ rectification to break the linearity and stay between … So, the layer convt is not the mathematical inverse of the layer conv. We can matrix-multiply C.T (16x4) with a column vector (4x1) to generate an output matrix (16x1). ConvTranspose2d (in_channels = 8, out_channels = 8, kernel_size = 5) convt (y). The transposed matrix connects 1 value to 9 values in the output. Now let jump to our layer1 which consists of two conv2d layers followed by ReLU activation function and BatchNormalization.self.layer1 takes 3 channels as an input and gives out 32 channels as output.. Pre-trained models and datasets built by Google and the community In my previous article, I have explained why we import nn.Module and use super method. You may want to use different factors on each dimension, such as double the width and triple the height. Similarly self.layer2 takes 32 channel as input and give out 128 channel as ouput.