Pandas

Pandas is a high‑level data manipulation library built on NumPy. It provides two main data structures, `Series` (1‑D) and `DataFrame` (2‑D), which make it easy to work with structured data such as CSV files and database tables.

Installation
`pip install pandas`

Creating a DataFrame
Build a DataFrame from a dictionary of columns, or read one from a CSV file.


import pandas as pd

data = {
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["NYC", "LA", "Chicago"]
}
df = pd.DataFrame(data)
print(df)

# Read from CSV
# df = pd.read_csv("file.csv")
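
As a minimal sketch of the CSV route, the snippet below reads from an in-memory string instead of a file on disk, so it runs without any external file. The CSV text and the names `csv_text` and `df_csv` are made up for illustration; `pd.read_csv` accepts either a path or a file-like object.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file such as "file.csv".
csv_text = """Name,Age,City
Alice,25,NYC
Bob,30,LA
Charlie,35,Chicago
"""

# StringIO wraps the text in a file-like object that read_csv can consume.
df_csv = pd.read_csv(io.StringIO(csv_text))

print(df_csv.shape)   # (number of rows, number of columns)
print(df_csv.dtypes)  # the Age column is parsed as an integer type
```

With a real file, passing the path directly (as in the commented line above) should behave the same way.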

Inspecting Data
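`df.head()` shows the first five rows, `df.info()` lists each column's dtype and non-null count, and `df.describe()` gives summary statistics for the numeric columns.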
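
A short sketch of these three calls on the sample data from above. The DataFrame is rebuilt here so the snippet runs on its own; the names `first_two` and `summary` are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["NYC", "LA", "Chicago"],
})

first_two = df.head(2)    # first 2 rows; head() with no argument returns 5
df.info()                 # prints column names, non-null counts, and dtypes
summary = df.describe()   # count, mean, std, min, quartiles, max (numeric columns)

print(first_two)
print(summary.loc["mean", "Age"])  # mean age: 30.0
```

Note that `df.info()` prints its report directly rather than returning it, while `head` and `describe` return new DataFrames.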

Selecting Data
Select a column by name, filter rows with a boolean mask, or use `loc` to pick rows and columns by label.


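ages = df["Age"]                        # one column, returned as a Series
young = df[df["Age"] < 30]              # keep rows where the condition is True
subset = df.loc[0:1, ["Name", "City"]]  # loc slices by label, end inclusive: rows 0 and 1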
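
The difference between `loc` (labels) and `iloc` (integer positions) is easier to see when the index is not the default 0, 1, 2. The sketch below assumes hypothetical string labels `"a"`, `"b"`, `"c"` purely for illustration, and also shows how to combine two boolean conditions.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Alice", "Bob", "Charlie"],
        "Age": [25, 30, 35],
        "City": ["NYC", "LA", "Chicago"],
    },
    index=["a", "b", "c"],  # hypothetical labels, not part of the original data
)

# loc: label-based, the end of the slice is included -> rows "a" and "b"
by_label = df.loc["a":"b", ["Name", "City"]]

# iloc: position-based, the end of the slice is excluded -> positions 0 and 1
by_position = df.iloc[0:2, [0, 2]]

# Combine conditions with & (and) or | (or); each condition needs parentheses
young_in_nyc = df[(df["Age"] < 30) & (df["City"] == "NYC")]

print(by_label)
print(by_position)
print(young_in_nyc)
```

Both slices select the same two rows here; only the way the rows are addressed differs.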

Data Cleaning
Handle missing values, rename columns, drop duplicates.


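# dropna and fillna are alternatives: after dropping, nothing is left to fill
df.dropna(inplace=True)      # option 1: drop rows containing any missing value
# df.fillna(0, inplace=True) # option 2: keep the rows, replace missing values with 0
df.rename(columns={"Name": "FullName"}, inplace=True)
df.drop_duplicates(inplace=True)  # remove rows that repeat an earlier row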
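
The sample data above has no missing values or duplicates, so the following sketch uses a small made-up frame (`raw`, with a hypothetical fourth person) to show what each cleaning call actually does. It uses the non-inplace form, which returns a new DataFrame and leaves the original untouched.

```python
import numpy as np
import pandas as pd

# Illustrative data: one duplicated row (Bob) and one missing Age (Dana).
raw = pd.DataFrame({
    "Name": ["Alice", "Bob", "Bob", "Dana"],
    "Age": [25, 30, 30, np.nan],
})

dropped = raw.dropna()            # removes the row with the missing Age
filled = raw.fillna({"Age": 0})   # keeps the row, sets the missing Age to 0
deduped = raw.drop_duplicates()   # removes the second, identical Bob row

print(len(raw), len(dropped), len(filled), len(deduped))  # 4 3 4 3
```

Whether to drop or fill depends on the data; filling with 0 is only one possible choice and can distort statistics such as the mean.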

Grouping and Aggregation
`groupby` splits the rows into groups by a key column, then applies an aggregation such as `mean` to each group.


grouped = df.groupby("City")["Age"].mean()
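
With one person per city, the grouped mean above just returns each person's age. The sketch below adds a hypothetical fourth row (Dana, also in NYC) so a group has more than one member, and shows `agg` for several statistics at once plus the equivalent `pivot_table` call.

```python
import pandas as pd

people = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie", "Dana"],  # Dana is made up for this example
    "Age": [25, 30, 35, 45],
    "City": ["NYC", "LA", "Chicago", "NYC"],
})

# One aggregate per group: a Series indexed by City
mean_age = people.groupby("City")["Age"].mean()

# Several aggregates per group: a DataFrame with one column per statistic
stats = people.groupby("City")["Age"].agg(["count", "mean", "max"])

# pivot_table can express the same summary; the result is a DataFrame
pivot = people.pivot_table(values="Age", index="City", aggfunc="mean")

print(mean_age["NYC"])  # (25 + 45) / 2 = 35.0
print(stats)
print(pivot)
```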

Two Minute Drill
  • Pandas provides DataFrames for tabular data.
  • Read data from CSV, Excel, databases.
  • Select rows/columns with `[]`, `loc`, `iloc`.
  • Handle missing data with `dropna`, `fillna`.
  • Use `groupby` for aggregation, `pivot_table` for summaries.
