Data Filtering
Often you need to extract a subset of your data, for example all rows where Age > 30, or only certain columns. Pandas handles both with bracket indexing: a column name (or a list of names) selects columns, and a boolean condition selects rows.
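The snippets in this section all operate on a DataFrame called df that the text never defines. To run them, something like the following hypothetical table should work. Only the column names (Name, Age, Score, City) come from the examples; the values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: the values are invented for illustration.
# Only the column names are taken from the examples in this section.
df = pd.DataFrame({
    "Name": ["Asha", "Ben", "Chen", "Dina", "Eli"],
    "Age": [34, 19, 27, 45, 31],
    "Score": [88, 92, 79, 95, 67],
    "City": ["London", "Paris", "New York", "London", "Tokyo"],
})

print(df)
```

Because no index is passed, the rows get the default labels 0 to 4, which is what the loc example later in this section relies on.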
Selecting Columns
# Single column -> Series
ages = df['Age']
# Multiple columns -> DataFrame
subset = df[['Name', 'Score']]
Filtering Rows by Condition
# Rows where Age > 30
adults = df[df['Age'] > 30]
# Multiple conditions (AND)
filtered = df[(df['Age'] > 25) & (df['Score'] > 85)]
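Each comparison above produces a boolean Series (a mask), and & combines two masks element by element. The parentheses are required because & binds more tightly than > in Python, so leaving them out does not express the intended condition. A minimal sketch on a hypothetical three-row frame (the data is made up) that names the intermediate masks:

```python
import pandas as pd

# Hypothetical data, invented for illustration.
df = pd.DataFrame({"Age": [22, 30, 41], "Score": [90, 80, 95]})

older = df["Age"] > 25       # boolean Series: False, True, True
high = df["Score"] > 85      # boolean Series: True, False, True

both = df[older & high]      # keeps only rows where both masks are True
not_older = df[~older]       # ~ inverts a mask (NOT)

print(both)
print(not_older)
```

Storing masks in named variables is optional, but it tends to keep longer conditions readable.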
# OR condition
filtered = df[(df['Age'] < 20) | (df['Score'] > 90)]
Using isin() for Categorical Filtering
df[df['City'].isin(['New York', 'London'])]
Selecting Rows by Position with iloc
# First 3 rows
df.iloc[:3]
# Rows at positions 2 to 4, columns at positions 0 to 2 (the end of each slice is excluded)
df.iloc[2:5, 0:3]
Selecting by Label with loc
# Rows with index labels 0 and 2, and columns 'Name', 'Score'
df.loc[[0, 2], ['Name', 'Score']]
Why Filtering Matters for AI
You often need to split data by class (e.g., all spam emails), remove outliers, or select specific features before training.
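As a rough sketch of those three tasks, assume a hypothetical email dataset. The column names (label, length, num_links), the values, and the length cutoff of 1000 are all made up for illustration.

```python
import pandas as pd

# Hypothetical dataset; columns and values are invented for illustration.
emails = pd.DataFrame({
    "label": ["spam", "ham", "spam", "ham", "ham"],
    "length": [120, 300, 95, 5000, 280],
    "num_links": [7, 1, 9, 0, 2],
})

# 1. Split by class: keep only the spam rows.
spam = emails[emails["label"] == "spam"]

# 2. Remove outliers: drop rows longer than an assumed cutoff of 1000.
no_outliers = emails[emails["length"] <= 1000]

# 3. Select feature columns (X) and the target column (y) for training.
X = no_outliers[["length", "num_links"]]
y = no_outliers["label"]

print(len(spam), len(no_outliers), X.shape)
```

A fixed cutoff is only one way to define an outlier; in practice the threshold would depend on the data.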
Two Minute Drill
- Select columns: df['col'] or df[['col1','col2']].
- Filter rows: df[df['col'] > value].
- Combine conditions with & (AND) and | (OR).
- iloc for position-based, loc for label-based selection.
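The difference between iloc and loc is easiest to see when the index labels are not 0, 1, 2, and so on. A minimal sketch, using a hypothetical frame with made-up labels 10, 20, 30:

```python
import pandas as pd

# Hypothetical data with non-default index labels, invented for illustration.
df = pd.DataFrame(
    {"Name": ["Asha", "Ben", "Chen"], "Score": [88, 92, 79]},
    index=[10, 20, 30],
)

first_by_position = df.iloc[0]   # first row, whatever its label is
row_by_label = df.loc[20]        # row whose index label is 20

# iloc slices exclude the end position; loc slices include the end label.
by_position = df.iloc[0:2]       # positions 0 and 1
by_label = df.loc[10:20]         # labels 10 and 20

print(first_by_position["Name"], row_by_label["Name"])
```

Here both slices return the same two rows, but for different reasons: the first stops before position 2, while the second runs up to and including label 20.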
