Data Filtering Interview Questions

Q1. From a DataFrame of employees with columns 'department' and 'salary', select all rows where department is 'IT' and salary > 70000.

filtered = df[(df['department'] == 'IT') & (df['salary'] > 70000)]

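Use & for AND and | for OR; Python's own and/or keywords do not work element-wise on a Series. Each comparison must be wrapped in parentheses because & and | bind more tightly than == and >. This technique is called boolean indexing. As an alternative, use df.query('department == "IT" and salary > 70000').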
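
The two forms can be compared side by side in a minimal sketch. The sample rows and the 'name' column below are made up for illustration and are not part of the question.

```python
import pandas as pd

# Hypothetical sample data (assumed for illustration only)
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cy", "Dee"],
    "department": ["IT", "IT", "HR", "IT"],
    "salary": [85000, 60000, 90000, 70000],
})

# Boolean indexing: each comparison is parenthesized before combining with &
by_mask = df[(df["department"] == "IT") & (df["salary"] > 70000)]

# The same filter expressed as a query string
by_query = df.query('department == "IT" and salary > 70000')

print(by_mask)
```

Only Ana passes both conditions here: Dee is in IT but earns exactly 70000, which the strict > comparison excludes.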

Q2. Filter rows where the 'product' column contains the substring 'phone' using str.contains.

mask = df['product'].str.contains('phone', na=False)
filtered = df[mask]

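na=False treats missing values (NaN) as non-matches, so the mask contains only True and False. Pass case=False for a case-insensitive match. By default the pattern is interpreted as a regular expression; pass regex=False to match it as a literal string.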
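
To see how case and na interact, here is a short sketch on a made-up product list (the values are assumptions chosen to include a missing entry and mixed capitalization).

```python
import pandas as pd

# Hypothetical product names, including one missing value
df = pd.DataFrame(
    {"product": ["iPhone 15", "phone case", "Laptop", None, "Headphones"]}
)

# Case-sensitive (default): "iPhone 15" does not match because of the capital P
sensitive = df["product"].str.contains("phone", na=False)

# Case-insensitive: "iPhone 15" now matches as well
insensitive = df["product"].str.contains("phone", case=False, na=False)

print(df[sensitive])
print(df[insensitive])
```

In both masks the missing value becomes False, so it can be used directly for indexing without raising an error.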

Q3. Select rows 10 through 20 (inclusive) using .iloc and also rows with index labels 'row5' to 'row10' using .loc.

df.iloc[10:21]                     # iloc uses integer positions (end exclusive)
df.loc['row5':'row10']              # loc includes both labels

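iloc slices by integer position and excludes the end, so 10:21 covers positions 10 through 20. loc slices by label and includes both endpoints.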
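
The difference in endpoint handling is easy to check on a small frame. This sketch assumes a hypothetical index labelled row0 through row24 and a single 'value' column.

```python
import pandas as pd

# Hypothetical frame: 25 rows labelled "row0" .. "row24"
df = pd.DataFrame(
    {"value": range(25)},
    index=[f"row{i}" for i in range(25)],
)

# Positional slice: the end position 21 is excluded
by_position = df.iloc[10:21]

# Label slice: both "row5" and "row10" are included
by_label = df.loc["row5":"row10"]

print(len(by_position), len(by_label))
```

The positional slice yields 11 rows (positions 10 to 20) and the label slice yields 6 rows (row5 to row10).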

Q4. From a DataFrame with columns A, B, C select only columns A and C. Also select all columns except B.

df[['A', 'C']]                     # select A and C
df.drop('B', axis=1)                # drop B
df.loc[:, df.columns != 'B']        # alternative exclude B

All three return new DataFrames and leave df unchanged. To modify the original, reassign the result (df = df.drop('B', axis=1)) or pass inplace=True to drop().
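
A quick sketch of the three approaches on a hypothetical two-row frame, confirming that the original is left untouched. It uses drop(columns="B"), which is equivalent to drop("B", axis=1).

```python
import pandas as pd

# Hypothetical data for illustration
df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})

# Keep only A and C
only_ac = df[["A", "C"]]

# Exclude B by dropping it; columns="B" is the same as ("B", axis=1)
without_b = df.drop(columns="B")

# Exclude B with a boolean mask over the column labels
without_b_mask = df.loc[:, df.columns != "B"]

print(list(df.columns))  # df itself still has all three columns
```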

Q5. Filter rows where the 'score' column is between 60 and 80 inclusive. Use the between() method.

filtered = df[df['score'].between(60, 80, inclusive='both')]

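between() is concise, and inclusive='both' is its default, so both endpoints are kept. Rows where score is NaN are excluded. The equivalent boolean-indexing form is df[(df.score >= 60) & (df.score <= 80)].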
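
The effect of the inclusive argument can be sketched on a few made-up scores around the boundaries. The string values 'both' and 'neither' assume pandas 1.3 or later.

```python
import pandas as pd

# Hypothetical scores at and around the boundaries, plus one missing value
df = pd.DataFrame({"score": [59, 60, 75, 80, 81, None]})

# Both endpoints kept (assumes pandas >= 1.3 for string values of inclusive)
both = df[df["score"].between(60, 80, inclusive="both")]

# Both endpoints dropped
neither = df[df["score"].between(60, 80, inclusive="neither")]

# Equivalent to inclusive="both", written as explicit comparisons
manual = df[(df["score"] >= 60) & (df["score"] <= 80)]

print(both)
```

The missing score is filtered out in every variant, because comparisons against NaN evaluate to False.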
