Convolution Operation
Convolution is the core operation of CNNs. It slides a small filter (kernel) over the input, computing dot products at each position. This produces a feature map that highlights patterns like edges, textures, or shapes.
How Convolution Works
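Take a 3×3 kernel and a 5×5 image. Place the kernel at the top-left corner, multiply it element-wise with the 3×3 image patch beneath it, and sum the results to get one output value. Slide the kernel one pixel at a time (stride = 1) and repeat. The kernel fits in 5 − 3 + 1 = 3 positions along each axis, so the output is a 3×3 feature map.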
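The steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not an optimized implementation: the function name `conv2d_valid` is made up for this example, and the image values are placeholders rather than the figures from the text. Like most deep learning libraries, it does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
import numpy as np

def conv2d_valid(image, kernel, stride=1):
    """Slide `kernel` over `image` with no padding and return the feature map."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            patch = image[r:r + kh, c:c + kw]   # region under the kernel
            out[i, j] = np.sum(patch * kernel)  # element-wise multiply, then sum
    return out

# Placeholder 5x5 image (values 0..24) and a 3x3 kernel.
image = np.arange(25, dtype=float).reshape(5, 5)
kernel = np.array([[1, 0, 1],
                   [0, 1, 0],
                   [1, 0, 1]], dtype=float)

feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)  # (3, 3)
```

The explicit double loop mirrors the sliding description; real frameworks compute the same result with vectorized kernels.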
Input (5×5)      Kernel (3×3)     Output (3×3)
1 2 3 4 5        1 0 1            [ ... ]
6 7 8 9 1        0 1 0
...              1 0 1
Kernels (Filters)
Kernel values are not hand-designed; they are learned during training. Early layers tend to learn simple patterns (edges, corners), while deeper layers learn more complex structures (eyes, wheels, faces). A layer usually has many kernels, and each one produces its own feature map.
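To make "multiple kernels, multiple feature maps" concrete, here is a small sketch. The two edge kernels are hand-picked for illustration only (a trained CNN learns its own values), and the helper `conv2d_valid` and the toy image are assumptions of this example.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Stride-1 sliding dot product with no padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Hand-picked kernels for illustration; not learned values.
vertical_edge = np.array([[1, 0, -1],
                          [1, 0, -1],
                          [1, 0, -1]], dtype=float)
horizontal_edge = vertical_edge.T
kernels = [vertical_edge, horizontal_edge]

# Toy 5x5 image: dark (0) on the left, bright (1) in the two rightmost columns.
image = np.zeros((5, 5))
image[:, 3:] = 1.0

# One feature map per kernel, stacked along a new leading axis.
feature_maps = np.stack([conv2d_valid(image, k) for k in kernels])
print(feature_maps.shape)  # (2, 3, 3)
```

The vertical-edge map responds where the brightness changes from left to right, while the horizontal-edge map stays at zero because this image has no horizontal edges.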
Stride and Padding
- Stride: The number of pixels the kernel moves at each step. A larger stride reduces the output size.
- Padding: Adding zeros around the border of the input so the kernel can also be centered on edge pixels. With stride 1, 'same' padding keeps the output size equal to the input size.
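The effect of stride and padding on output size follows the standard formula floor((n + 2p − k) / s) + 1 along each axis, where n is the input size, k the kernel size, s the stride, and p the padding on each side. A small sketch, with the function name `conv_output_size` chosen for this example:

```python
def conv_output_size(n, k, stride=1, padding=0):
    """Output length along one axis: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - k) // stride + 1

print(conv_output_size(5, 3))             # 3  (the 5x5 image, 3x3 kernel case)
print(conv_output_size(5, 3, stride=2))   # 2  (larger stride, smaller output)
print(conv_output_size(5, 3, padding=1))  # 5  ('same' padding at stride 1)
```

For an odd kernel size at stride 1, 'same' padding corresponds to p = (k − 1) / 2 zeros on each side.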
Number of Parameters
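If a layer has 32 kernels of size 3×3 with 3 input channels (RGB), each kernel has 3×3×3 = 27 weights, so parameters = 32 × 27 + 32 biases = 896. The count does not depend on the image size, because the same kernel weights are reused at every position (parameter sharing). That is tiny compared to a fully connected layer, which needs a separate weight for every input-output pair.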
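The arithmetic above can be checked with a short sketch. The fully connected comparison is a hypothetical added for scale: it assumes a 32×32 RGB image flattened and fed to a dense layer with 32 outputs, which is not a configuration taken from the text. Both function names are made up for this example.

```python
def conv_params(num_kernels, kernel_h, kernel_w, in_channels):
    """Weights plus one bias per kernel."""
    weights = num_kernels * kernel_h * kernel_w * in_channels
    biases = num_kernels
    return weights + biases

def dense_params(num_inputs, num_outputs):
    """One weight per input-output pair, plus one bias per output."""
    return num_inputs * num_outputs + num_outputs

print(conv_params(32, 3, 3, 3))        # 896

# Hypothetical comparison: flattened 32x32 RGB image into 32 dense units.
print(dense_params(32 * 32 * 3, 32))   # 98336
```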
Two Minute Drill
- Convolution: slide filter, dot product, produce feature map.
- Kernels learn patterns; multiple kernels per layer.
- Stride controls step size; padding controls output size.
- Parameter sharing makes CNNs efficient.
