Python for AI: Data Filtering (interview questions)

Q1. From a DataFrame of employees with columns 'department' and 'salary', select all rows where department is 'IT' and salary > 70000.
filtered = df[(df['department'] == 'IT') & (df['salary'] > 70000)]
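Combine conditions with & (AND) and | (OR), not the Python keywords and/or, which raise a ValueError on a Series. Wrap each comparison in parentheses because & and | bind more tightly than == and >. This technique is called boolean indexing. An alternative is df.query('department == "IT" and salary > 70000').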
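As a minimal sketch of the two approaches side by side, the example below builds a small employee table and applies both filters. The sample names and salaries are made up for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data: names and salaries are invented for illustration.
df = pd.DataFrame({
    "name": ["Asha", "Ben", "Chen", "Dana"],
    "department": ["IT", "HR", "IT", "IT"],
    "salary": [85000, 90000, 65000, 72000],
})

# Boolean indexing: each comparison is parenthesized because & binds
# more tightly than == and >.
by_mask = df[(df["department"] == "IT") & (df["salary"] > 70000)]

# The same filter written as a query string, which uses the keyword "and".
by_query = df.query('department == "IT" and salary > 70000')

print(by_mask)
```

Both results keep the original row labels, so they can be compared directly with by_mask.equals(by_query).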

Q2. Filter rows where the 'product' column contains the substring 'phone' using str.contains.
mask = df['product'].str.contains('phone', na=False)
filtered = df[mask]
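na=False treats missing values (NaN) as non-matches, so the mask contains only True/False and can be used for indexing. Pass case=False for a case-insensitive match. The pattern is interpreted as a regular expression by default; pass regex=False to match it literally.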
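The effect of na=False and case=False is easiest to see on a column that mixes letter case and contains a missing value. The product names below are hypothetical and chosen only to exercise each option.

```python
import pandas as pd

# Hypothetical product names, including a missing value and mixed case.
df = pd.DataFrame(
    {"product": ["iPhone 15", "phone case", None, "Laptop", "Headphones"]}
)

# Case-sensitive match; the missing value counts as "no match".
sensitive = df[df["product"].str.contains("phone", na=False)]

# Case-insensitive match also picks up "iPhone 15".
insensitive = df[df["product"].str.contains("phone", case=False, na=False)]

print(sensitive["product"].tolist())
print(insensitive["product"].tolist())
```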

Q3. Select rows 10 through 20 (inclusive) using .iloc, and rows with index labels 'row5' to 'row10' using .loc.
df.iloc[10:21]                     # iloc uses integer positions (end exclusive)
df.loc['row5':'row10']              # loc includes both labels
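.iloc slices by integer position and excludes the end, so 10:21 returns positions 10 through 20. .loc slices by label and includes both endpoints.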
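A small sketch makes the endpoint difference concrete. It assumes a frame whose index labels follow the pattern 'row0', 'row1', and so on, which matches the labels in the question but is otherwise an invented setup.

```python
import pandas as pd

# Hypothetical frame with 30 rows labelled 'row0' .. 'row29'.
df = pd.DataFrame(
    {"value": range(30)},
    index=[f"row{i}" for i in range(30)],
)

# Position-based slice: the end (21) is excluded, giving positions 10..20.
by_position = df.iloc[10:21]

# Label-based slice: both 'row5' and 'row10' are included.
by_label = df.loc["row5":"row10"]

print(len(by_position), len(by_label))
```

The label slice follows the order of the index as stored, not alphabetical order, so 'row5' to 'row10' yields six consecutive rows here.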

Q4. From a DataFrame with columns A, B, C select only columns A and C. Also select all columns except B.
df[['A', 'C']]                     # select A and C
df.drop('B', axis=1)                # drop B
df.loc[:, df.columns != 'B']        # alternative exclude B
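All three expressions return a new DataFrame and leave df unchanged. Assign the result to keep it, for example df = df.drop('B', axis=1). Only drop() accepts inplace=True; df.drop(columns='B') is an equivalent spelling.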
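The following sketch checks that the three selections agree and that the original frame is left alone. The values in the frame are arbitrary placeholders.

```python
import pandas as pd

# Arbitrary placeholder values; only the column names matter here.
df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})

only_a_c = df[["A", "C"]]                # keep A and C by name
without_b = df.drop(columns="B")         # same as df.drop('B', axis=1)
without_b_mask = df.loc[:, df.columns != "B"]  # boolean mask over columns

# None of the calls above modified df.
print(list(df.columns))
```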

Q5. Filter rows where the 'score' column is between 60 and 80 inclusive. Use the between() method.
filtered = df[df['score'].between(60, 80, inclusive='both')]
between() is concise, and inclusive='both' (the default) keeps both endpoints. The equivalent explicit form is df[(df['score'] >= 60) & (df['score'] <= 80)].
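To see how the endpoints are treated, the sketch below uses scores that sit exactly on and just outside the bounds. The scores are invented, and the string values for inclusive assume pandas 1.3 or later.

```python
import pandas as pd

# Invented scores placed on and just outside the 60..80 bounds.
df = pd.DataFrame({"score": [59, 60, 75, 80, 81]})

# Closed interval: 60 and 80 are both kept.
closed = df[df["score"].between(60, 80, inclusive="both")]

# Equivalent explicit comparison.
explicit = df[(df["score"] >= 60) & (df["score"] <= 80)]

# Open interval for contrast: both endpoints are dropped.
open_interval = df[df["score"].between(60, 80, inclusive="neither")]

print(closed["score"].tolist(), open_interval["score"].tolist())
```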