Merging and Joining: Interview Questions

Q1. Scenario: You have two DataFrames: df1 (customer_id, name) and df2 (customer_id, purchase_amount). Merge them to get customer names with purchases.
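merged = pd.merge(df1, df2, on='customer_id', how='inner'). An inner join keeps only the rows whose customer_id appears in both DataFrames. Other values of how include 'left', 'right', and 'outer', and the behavior mirrors a SQL join. The method form df1.merge(df2, ...) is equivalent; df1.join(df2) is also available, but it joins on the index by default.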
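As a minimal sketch of the inner join described above: the sample rows below are made up for illustration, and only the column names come from the question. Customer 2 has no purchases and customer 4 has no name record, so neither should survive the join.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df1 = pd.DataFrame({"customer_id": [1, 2, 3], "name": ["Ana", "Ben", "Cy"]})
df2 = pd.DataFrame(
    {"customer_id": [1, 1, 3, 4], "purchase_amount": [20.0, 35.5, 12.0, 99.0]}
)

# Inner join: keep only customer_id values present in both DataFrames.
merged = pd.merge(df1, df2, on="customer_id", how="inner")

# The method form gives the same result (how defaults to "inner").
merged_method = df1.merge(df2, on="customer_id")

print(merged)
```

Customer 1 appears twice in the result because it matches two purchase rows; a merge produces one output row per matching pair, not one per key.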

Q2. Scenario: Concatenate two DataFrames with the same columns vertically (stack rows) using pd.concat.
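combined = pd.concat([df1, df2], axis=0, ignore_index=True). axis=0 stacks rows, and ignore_index=True discards the original indexes and assigns a fresh 0..n-1 range. df1.append(df2) used to do the same thing, but it was deprecated in pandas 1.4 and removed in pandas 2.0, so prefer pd.concat. Use axis=1 to place the DataFrames side by side as columns.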
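The effect of ignore_index and axis is easiest to see side by side. The following is a small sketch with made-up data; the column names id and score are assumptions for illustration.

```python
import pandas as pd

# Hypothetical DataFrames sharing the same columns.
df1 = pd.DataFrame({"id": [1, 2], "score": [0.5, 0.7]})
df2 = pd.DataFrame({"id": [3, 4], "score": [0.9, 0.1]})

# Stack rows and renumber the index from 0.
combined = pd.concat([df1, df2], axis=0, ignore_index=True)

# Without ignore_index, each input keeps its own labels, so they repeat.
kept = pd.concat([df1, df2], axis=0)

# axis=1 aligns on the index and places the columns side by side.
side_by_side = pd.concat([df1, df2], axis=1)

print(list(combined.index))  # fresh labels
print(list(kept.index))      # repeated labels
print(side_by_side.shape)
```

Repeated index labels are legal in pandas but make label-based lookups such as .loc return several rows, which is why ignore_index=True is usually the safer choice when stacking.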

Q3. Scenario: Left join df1 (products) with df2 (sales) on product_id, keep all products even if no sales. Fill missing sales with 0.
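merged = pd.merge(df1, df2, on='product_id', how='left'); merged['sales'] = merged['sales'].fillna(0). A left join keeps every row of the left DataFrame (df1). Products with no match in df2 get NaN in the sales column, which fillna(0) then replaces with 0.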
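A short sketch of the left join plus fill, using invented rows. The product column and the final cast back to integers are assumptions added for illustration: the NaN introduced by the join forces the sales column to float, so the cast is only needed if integer counts are wanted.

```python
import pandas as pd

# Hypothetical data: product 11 has no sales record.
products = pd.DataFrame(
    {"product_id": [10, 11, 12], "product": ["pen", "ink", "pad"]}
)
sales = pd.DataFrame({"product_id": [10, 12], "sales": [5, 8]})

# Left join: every product is kept; unmatched rows get NaN in "sales".
merged = pd.merge(products, sales, on="product_id", how="left")

# Replace the missing values with 0, then restore an integer dtype.
merged["sales"] = merged["sales"].fillna(0).astype(int)

print(merged)
```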

Q4. Scenario: Merge two DataFrames on multiple keys: (year, month).
merged = pd.merge(df1, df2, on=['year', 'month'], how='inner'). Passing a list to on makes the pair (year, month) the join key, so rows match only when both values are equal. This is a common way to align two time series recorded at the same granularity.
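To illustrate the composite key, here is a minimal sketch with invented monthly figures. The revenue and cost columns are hypothetical; only year and month come from the question.

```python
import pandas as pd

# Hypothetical monthly series that overlap only partially.
revenue = pd.DataFrame(
    {"year": [2023, 2023, 2024], "month": [11, 12, 1], "revenue": [100, 120, 90]}
)
costs = pd.DataFrame(
    {"year": [2023, 2024, 2024], "month": [12, 1, 2], "cost": [70, 60, 65]}
)

# Both year AND month must match for a row pair to be joined.
merged = pd.merge(revenue, costs, on=["year", "month"], how="inner")

print(merged)
```

Joining on month alone would be a mistake for data like this, because month 1 or month 12 of different years would be treated as the same period.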

Q5. Scenario: Perform an outer join between df1 and df2 and indicate which source each row came from using the indicator parameter.
merged = pd.merge(df1, df2, on='id', how='outer', indicator=True). The added '_merge' column takes the value 'both', 'left_only', or 'right_only' for each row. This is useful for debugging key mismatches between the two DataFrames.
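One way to use the indicator column for debugging is sketched below. The data and the column names a and b are invented for illustration; the filter on '_merge' is just one possible follow-up step.

```python
import pandas as pd

# Hypothetical data: id 1 exists only on the left, id 4 only on the right.
df1 = pd.DataFrame({"id": [1, 2, 3], "a": ["x", "y", "z"]})
df2 = pd.DataFrame({"id": [2, 3, 4], "b": [0.2, 0.3, 0.4]})

# Outer join keeps every key; indicator=True adds the "_merge" column.
merged = pd.merge(df1, df2, on="id", how="outer", indicator=True)

# Count rows per source, converting the categorical labels to plain strings.
counts = merged["_merge"].astype(str).value_counts().to_dict()

# Isolate the rows that failed to match on one side.
mismatches = merged[merged["_merge"] != "both"]

print(counts)
print(mismatches)
```

Passing a string instead of True, for example indicator='source', names the column 'source' rather than '_merge'.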