Why CNNs?
Fully connected layers treat each pixel as an independent feature. For a 256×256 image, that's 65,536 inputs – and adding a hidden layer of 1,000 neurons gives over 65 million parameters. This leads to overfitting, huge memory, and loss of spatial structure. Convolutional Neural Networks (CNNs) solve this by exploiting spatial locality.
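The arithmetic above can be sketched in a few lines; the counts for the hidden layer include bias terms, and the 3×3 filter size is just an illustrative choice:

```python
# Parameter count for a fully connected hidden layer on a 256x256 grayscale image.
inputs = 256 * 256            # 65,536 pixel inputs
hidden = 1_000                # hidden-layer neurons
fc_params = inputs * hidden + hidden   # weights + biases

# A single 3x3 convolutional filter, by contrast, has 9 weights + 1 bias,
# and those same 10 parameters are reused at every spatial position.
conv_params_per_filter = 3 * 3 + 1

print(fc_params)               # 65,537,000
print(conv_params_per_filter)  # 10
```

Even a stack of dozens of such filters stays orders of magnitude smaller than one fully connected layer.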
CNNs use convolution, pooling, and shared weights to dramatically reduce parameters while preserving spatial information.
Limitations of Fully Connected Layers for Images
- No translation invariance: shifting an object changes all input neurons.
- Too many parameters: easy to overfit.
- Ignores local structure: neighboring pixels are highly correlated.
How CNNs Address These
- Local connectivity: neurons only connect to a small region (receptive field).
- Shared weights: same filter slides across the whole image.
- Hierarchical feature learning: edges → shapes → objects.
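The first two ideas above can be made concrete with a minimal sketch of a 2D convolution in plain Python (no padding, stride 1; real frameworks implement this far more efficiently):

```python
def conv2d(image, kernel):
    """Valid 2D cross-correlation: one small kernel slides over the image.

    Local connectivity: each output value depends only on a k x k
    receptive field. Shared weights: the same kernel entries are
    reused at every position.
    """
    k = len(kernel)
    h, w = len(image), len(image[0])
    out = []
    for i in range(h - k + 1):
        row = []
        for j in range(w - k + 1):
            row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                           for di in range(k) for dj in range(k)))
        out.append(row)
    return out

# Usage: a 3x3 all-ones kernel over a 4x4 all-ones image.
img = [[1] * 4 for _ in range(4)]
ker = [[1] * 3 for _ in range(3)]
print(conv2d(img, ker))  # [[9, 9], [9, 9]]
```

Note that the output shrinks from 4×4 to 2×2 because no padding is used; the 9 kernel weights are the only learnable parameters, regardless of image size.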
Analogy: Detecting Edges
A small filter (e.g., 3×3) slides over the image, looking for vertical edges. Wherever it finds a vertical edge, it activates strongly. This same filter works anywhere – so the object can shift without breaking detection.
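A minimal sketch of this analogy, using a simple Sobel-like vertical-edge kernel (the specific patch values are illustrative assumptions): the same shared weights respond strongly on an edge and not at all on a flat region, wherever the edge happens to sit.

```python
def respond(patch, kernel):
    """Response of a 3x3 filter at one position: element-wise dot product."""
    return sum(patch[i][j] * kernel[i][j] for i in range(3) for j in range(3))

# Simple vertical-edge detector: negative weights left, positive weights right.
vertical_edge = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]

# A patch containing a dark-to-bright vertical edge...
edge_patch = [[0, 0, 9],
              [0, 0, 9],
              [0, 0, 9]]
# ...and a flat patch with no edge.
flat_patch = [[5, 5, 5],
              [5, 5, 5],
              [5, 5, 5]]

print(respond(edge_patch, vertical_edge))  # 27 -> strong activation
print(respond(flat_patch, vertical_edge))  # 0  -> no activation
```

Because the filter is applied at every position, shifting the edge only shifts where the strong response appears; the detection itself is unchanged.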
Two Minute Drill
- Fully connected layers scale poorly for images.
- CNNs use local connectivity and shared weights.
- They learn hierarchical features: edges → parts → objects.
- Convolution is translation‑equivariant (features shift with the input); pooling adds a degree of translation invariance.
Need more clarification?
Drop us an email at career@quipoinfotech.com
