
Clustering with K-Means

Clustering is an unsupervised learning task where we group similar data points together without any labels. K‑means is one of the most popular clustering algorithms.

K‑means partitions data into k clusters, each represented by its centroid (average point).

How K‑means Works

1. Choose k (number of clusters).
2. Randomly initialize k centroids.
3. Assign each point to the nearest centroid.
4. Update centroids as mean of assigned points.
5. Repeat steps 3‑4 until convergence (centroids stop moving).
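The steps above can be sketched from scratch in a few lines of NumPy. This is an illustrative implementation, not what scikit-learn does internally (which adds smarter initialization and other refinements); it assumes no cluster ends up empty.

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=42):
    """Minimal K-means sketch: returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Step 2: randomly pick k distinct data points as initial centroids.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # Step 3: assign each point to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Step 4: recompute each centroid as the mean of its assigned points.
        # (Assumes every cluster keeps at least one point.)
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        # Step 5: stop when the centroids no longer move.
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids
```

On two well-separated blobs of points, the loop typically converges in a handful of iterations.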

from sklearn.cluster import KMeans

# X: array of shape (n_samples, n_features)
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
kmeans.fit(X)
labels = kmeans.labels_              # cluster index for each point
centroids = kmeans.cluster_centers_  # coordinates of the k centroids

Choosing k – The Elbow Method

Compute the inertia (sum of squared distances from each point to its nearest centroid) for a range of k values. Plot inertia against k and look for an "elbow": the point beyond which inertia stops decreasing rapidly.
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

inertias = []
for k in range(1, 11):
    kmeans = KMeans(n_clusters=k, n_init=10, random_state=42)
    kmeans.fit(X)
    inertias.append(kmeans.inertia_)

plt.plot(range(1, 11), inertias, marker='o')
plt.xlabel('k')
plt.ylabel('Inertia')

When to Use K‑means

K‑means works well when clusters are roughly spherical, similar in size, and well‑separated. It struggles with irregular (non‑convex) shapes, clusters of very different densities, and outliers. Because it relies on Euclidean distance, always scale your features before clustering, otherwise features with large numeric ranges dominate.
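To see why scaling matters, here is a small sketch with made-up data: one feature in metres and one in dollars. The dollar column has a far larger range, so without scaling it would dominate the distance computation; `StandardScaler` puts both features on a comparable scale first.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical data: [height in metres, income in dollars].
X = np.array([[1.5, 30_000],
              [1.6, 32_000],
              [1.8, 90_000],
              [1.7, 95_000]])

# Standardize each feature to zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(X)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=42).fit(X_scaled)
print(kmeans.labels_)
```

After scaling, the two low-income and two high-income points form two clean clusters; fitting on the raw `X` would give the same grouping here only because income alone separates them, which is exactly the kind of accidental dominance scaling avoids.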


Two Minute Drill
  • K‑means groups data into k clusters.
  • Use elbow method to choose k.
  • Scale features before clustering.
  • Works best for spherical clusters.

Need more clarification?

Drop us an email at career@quipoinfotech.com