Hierarchical Clustering

Hierarchical clustering builds a tree of clusters (dendrogram) without pre‑specifying the number of clusters. It is useful for exploring data structure and for small datasets.

Hierarchical clustering creates a hierarchy of clusters that can be cut at any level to obtain a desired number of clusters.

Agglomerative (Bottom‑Up) Approach

1. Start with each point as its own cluster.
2. Find the two closest clusters and merge them.
3. Repeat until only one cluster remains.
4. The sequence of merges is shown in a dendrogram.
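The four steps above can be sketched from scratch. The function below is a naive, illustrative implementation (not from the original text) using single linkage on a made-up toy dataset:

```python
import numpy as np

def agglomerative_single_linkage(X, n_clusters):
    """Naive bottom-up clustering: repeatedly merge the two closest clusters."""
    clusters = [[i] for i in range(len(X))]   # step 1: each point is its own cluster
    while len(clusters) > n_clusters:
        best = None
        # step 2: find the closest pair of clusters (single linkage:
        # distance between their closest member points)
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(np.linalg.norm(X[i] - X[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)         # step 3: merge, then repeat
    return clusters

X = np.array([[0.0], [0.1], [5.0], [5.1], [9.0]])  # toy 1-D data (made up)
print(agglomerative_single_linkage(X, 3))
```

Stopping at `n_clusters` instead of one final cluster corresponds to cutting the dendrogram at a chosen level.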

from sklearn.cluster import AgglomerativeClustering

# X is an (n_samples, n_features) array of points to cluster
cluster = AgglomerativeClustering(n_clusters=3)  # Ward linkage by default
labels = cluster.fit_predict(X)                  # one cluster label per sample

Linkage Criteria

  • Single linkage: distance between closest points of two clusters (tends to create chains).
  • Complete linkage: distance between farthest points (tends to create compact clusters).
  • Average linkage: average distance between all pairs.
  • Ward linkage: minimizes variance within clusters (most popular).
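As an illustrative sketch, the linkage criterion is selected via the `linkage` parameter of `AgglomerativeClustering`; the tiny two-blob dataset below is made up for the example:

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Toy data: two well-separated blobs (made up for illustration)
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]])

# Compare the four linkage criteria on the same data
for method in ("ward", "complete", "average", "single"):
    labels = AgglomerativeClustering(n_clusters=2, linkage=method).fit_predict(X)
    print(method, labels)
```

On cleanly separated blobs all four criteria find the same split; their differences show up on elongated or noisy clusters, where single linkage tends to chain and Ward or complete linkage stay compact.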

Plotting a Dendrogram

Use `scipy` to visualize the hierarchical tree; the height at which two clusters merge indicates their dissimilarity.

from scipy.cluster.hierarchy import dendrogram, linkage
import matplotlib.pyplot as plt

# X is an (n_samples, n_features) array of points to cluster
linked = linkage(X, method='ward')  # compute the full merge history
dendrogram(linked)                  # draw the tree
plt.show()
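To actually cut the tree at a chosen number of clusters, one option (a sketch, not from the original text) is SciPy's `fcluster`, applied to the same linkage matrix; the toy dataset is made up for illustration:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy data: two well-separated blobs (made up for illustration)
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]])

linked = linkage(X, method='ward')  # full merge history

# Cut the dendrogram so that at most 3 flat clusters remain
labels = fcluster(linked, t=3, criterion='maxclust')
print(labels)
```

This is the "cut at any level" idea in code: the same `linked` matrix can be re-cut with different `t` values without re-running the clustering.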

When to Use Hierarchical Clustering

Hierarchical clustering suits small datasets (up to a few thousand points) where a dendrogram offers insight into the data's structure. It is not suitable for very large datasets: the standard agglomerative algorithm has O(n³) time and O(n²) memory complexity.


Two Minute Drill
  • Hierarchical clustering builds a dendrogram.
  • Agglomerative clustering starts with single points and merges.
  • Linkage criteria (Ward, complete, single) affect cluster shapes.
  • Good for exploring structure in small datasets.
