Q1. What is pandas and what are its main data structures?
pandas is an open-source Python library for data manipulation and analysis. Its main data structures are the Series (a 1D labeled array) and the DataFrame (a 2D labeled table whose columns are Series). It provides tools for reading and writing data, handling missing values, grouping, merging, and more.
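As a minimal sketch of the two structures (the names ages and people and the sample values are made up for illustration):

```python
import pandas as pd

# A Series is a 1D labeled array: values plus an index of labels.
ages = pd.Series([25, 30], index=["Alice", "Bob"], name="age")

# A DataFrame is a 2D labeled table; each column is itself a Series.
people = pd.DataFrame({"name": ["Alice", "Bob"], "age": [25, 30]})

print(ages["Bob"])          # label-based lookup on the Series
print(type(people["age"]))  # selecting one column returns a Series
print(people.shape)         # (number of rows, number of columns)
```

Selecting a single column from the DataFrame gives back a Series, which is why most Series methods also work column by column on a DataFrame.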
Q2. How do you create a DataFrame?
A DataFrame can be built from a dictionary of columns, a list of rows, or a file such as a CSV. Example:
import pandas as pd
df = pd.DataFrame({'name': ['Alice', 'Bob'], 'age': [25, 30]})  # from a dictionary of columns
df = pd.read_csv('data.csv')  # from a CSV file
Q3. How do you select data from a DataFrame?
Use df['column'] to select a column, df.loc[] for label-based indexing, and df.iloc[] for integer-position indexing. For boolean indexing, pass a condition inside the brackets: df[df['age'] > 25]. Example:
df.loc[0:2, 'name']  # rows labeled 0 through 2 (end label included), column 'name'
df.iloc[:, 1:3]  # all rows, columns at positions 1 and 2 (end position excluded)
Q4. How do you handle missing data in pandas?
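Use isnull() (or its alias isna()) to detect missing values, dropna() to remove rows or columns that contain them, fillna() to replace them with a value, and ffill() or bfill() to propagate the previous or next valid value. Each call returns a new DataFrame and leaves df unchanged. Example:
df.dropna()  # drop rows that contain any missing value
df.fillna(0)  # replace missing values with 0
df.ffill()  # forward fill; replaces fillna(method='ffill'), deprecated since pandas 2.1
Q5. What is groupby in pandas?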
groupby() splits the rows into groups based on one or more keys, applies a function such as an aggregation to each group, and combines the results. Example:
df.groupby('category')['value'].mean()  # mean of 'value' within each category
You can also apply several aggregations at once with agg(), or pass a custom function.
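A minimal sketch of multiple aggregations and a custom function, assuming a small made-up table with the same 'category' and 'value' columns (the data and the helper value_range are hypothetical):

```python
import pandas as pd

# Hypothetical data; the column names mirror the example above.
df = pd.DataFrame({
    "category": ["a", "a", "b", "b"],
    "value": [1, 3, 10, 20],
})

# Several built-in aggregations at once: one result column per function.
summary = df.groupby("category")["value"].agg(["mean", "sum", "count"])

# A custom function receives each group's values as a Series.
def value_range(values):
    return values.max() - values.min()

ranges = df.groupby("category")["value"].agg(value_range)

print(summary)
print(ranges)
```

Passing a list to agg() yields a DataFrame with one column per aggregation, while a single function yields a Series indexed by group key.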