Grouping and Aggregation

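Often you need to group data by a category and then compute a statistic for each group, for example the average score per class or the total sales per region. The pandas groupby() method handles this: it splits the rows by key, applies a function to each group, and combines the results.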

Basic GroupBy

import pandas as pd

df = pd.DataFrame({
    'Department': ['HR', 'IT', 'IT', 'HR', 'Finance'],
    'Salary': [50000, 70000, 80000, 55000, 90000]
})

# Group by Department and compute mean salary
grouped = df.groupby('Department')['Salary'].mean()
print(grouped)
Output:
Department
Finance    90000.0
HR         52500.0
IT         75000.0
Name: Salary, dtype: float64

Multiple Aggregations

result = df.groupby('Department')['Salary'].agg(['mean', 'median', 'count'])
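
To make the shape of that result concrete, the sketch below rebuilds the Department/Salary data from the first example and prints the aggregated table. The names salaries, summary and named are illustrative and not part of the original tutorial. The second call uses named aggregation, which assumes pandas 0.25 or later.

```python
import pandas as pd

# Same data as the Basic GroupBy example, rebuilt so this snippet runs on its own.
salaries = pd.DataFrame({
    'Department': ['HR', 'IT', 'IT', 'HR', 'Finance'],
    'Salary': [50000, 70000, 80000, 55000, 90000]
})

# Passing a list to .agg() returns a DataFrame with one column per function.
summary = salaries.groupby('Department')['Salary'].agg(['mean', 'median', 'count'])
print(summary)

# Named aggregation: keyword = function, so you choose the output column names.
named = salaries.groupby('Department')['Salary'].agg(avg_salary='mean', headcount='count')
print(named)
```

The list form is the quickest to type. The named form is worth considering when the result feeds into later code that expects specific column names.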

Grouping by Multiple Columns

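# A separate name keeps the Department/Salary df from earlier intact
sales_df = pd.DataFrame({
    'City': ['NY', 'NY', 'LA', 'LA'],
    'Year': [2020, 2021, 2020, 2021],
    'Sales': [100, 150, 200, 250]
})
# Passing a list of columns creates one group per (City, Year) pair
grouped = sales_df.groupby(['City', 'Year'])['Sales'].sum()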
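
Grouping by two columns returns a Series indexed by both keys, which can be awkward to read at first. The following is a minimal sketch of a few ways to work with that result; the names sales, totals, ny_2021, flat and wide are illustrative only.

```python
import pandas as pd

sales = pd.DataFrame({
    'City': ['NY', 'NY', 'LA', 'LA'],
    'Year': [2020, 2021, 2020, 2021],
    'Sales': [100, 150, 200, 250]
})

# Grouping by two columns yields a Series with a (City, Year) MultiIndex.
totals = sales.groupby(['City', 'Year'])['Sales'].sum()

# Look up a single group with a tuple key.
ny_2021 = totals.loc[('NY', 2021)]

# reset_index() turns the index levels back into ordinary columns.
flat = totals.reset_index()

# unstack() pivots the inner level (Year) into columns: one row per city.
wide = totals.unstack()
print(wide)
```

The flat form tends to suit further filtering or merging, while the wide form is closer to what a reader expects in a summary table.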

Custom Aggregation Functions

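def range_func(x):
    # x is the Series of values belonging to one group
    return x.max() - x.min()

# df here is the Department/Salary DataFrame from the Basic GroupBy example
df.groupby('Department')['Salary'].agg(range_func)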
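
As a self-contained illustration of the custom function idea, the sketch below rebuilds the Department/Salary data and applies a range function three ways: as a named function, as a lambda, and mixed with a built-in aggregation. The names salaries, salary_range, ranges, ranges_lambda and mixed are assumptions made for this example.

```python
import pandas as pd

salaries = pd.DataFrame({
    'Department': ['HR', 'IT', 'IT', 'HR', 'Finance'],
    'Salary': [50000, 70000, 80000, 55000, 90000]
})

def salary_range(x):
    # x is the Series of salaries for one department.
    return x.max() - x.min()

# A named function: called once per group, must return a single value.
ranges = salaries.groupby('Department')['Salary'].agg(salary_range)
print(ranges)

# The same computation written as a lambda.
ranges_lambda = salaries.groupby('Department')['Salary'].agg(lambda x: x.max() - x.min())

# Built-in names and custom functions can share one list;
# the custom column is labelled with the function's name.
mixed = salaries.groupby('Department')['Salary'].agg(['mean', salary_range])
print(mixed)
```

A named function is usually preferable to a lambda when it appears in a list, because the function name becomes the column label in the result.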

Why GroupBy Matters for AI

You might need to:
  • Compute class‑wise statistics in a dataset (e.g., average pixel intensity per digit class in MNIST).
  • Aggregate user behavior for recommendation systems.
  • Prepare summary tables for visualization.
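
The first use case above can be sketched with a toy dataset. This is not MNIST: the samples frame, its label and intensity columns, and the values in it are hypothetical stand-ins for a labelled dataset with one numeric feature.

```python
import pandas as pd

# Hypothetical toy data: one row per sample, with a class label and
# a single numeric feature (standing in for something like mean pixel intensity).
samples = pd.DataFrame({
    'label': [0, 0, 1, 1, 1, 2],
    'intensity': [0.10, 0.30, 0.50, 0.70, 0.90, 0.40]
})

# Per-class mean of the feature, plus the number of samples in each class.
class_stats = samples.groupby('label')['intensity'].agg(['mean', 'count'])
print(class_stats)
```

The count column doubles as a quick class-balance check, which is often worth looking at before training a model.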


Two Minute Drill
  • df.groupby('column')['value'].mean() – group and aggregate.
  • Use .agg() for multiple functions.
  • Group by multiple columns with a list.
  • Custom functions can be passed to .agg().
