Q1. You are a data scientist analyzing customer data for an e-commerce site. You have 1000 customers and 50 features (age, purchase history, etc.). How would you represent this data in a structured way?
This data is naturally represented as a matrix with 1000 rows (customers) and 50 columns (features).
Each entry is the value of a feature for a customer.
Matrices allow you to perform operations like scaling, transforming, or applying machine learning algorithms (e.g., linear regression: y = X·w).
Libraries like NumPy (2D arrays) and pandas (DataFrames) store this kind of tabular data for efficient computation.
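As a minimal sketch of this representation, the snippet below builds a 1000 × 50 matrix and applies the two operations mentioned above (scaling and the linear model y = X·w). The data and the weight vector `w` are synthetic placeholders, not real customer data or fitted weights.

```python
import numpy as np

rng = np.random.default_rng(0)       # synthetic data, for illustration only
X = rng.normal(size=(1000, 50))      # 1000 customers (rows) x 50 features (columns)
w = rng.normal(size=50)              # hypothetical weight vector, one weight per feature

# Scale every feature (column) to zero mean and unit variance.
X_scaled = (X - X.mean(axis=0)) / X.std(axis=0)

# Linear model y = X·w: one predicted value per customer.
y = X_scaled @ w

print(X.shape, y.shape)
```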
Q2. In image processing, a grayscale image is a matrix of pixel intensities (0-255). If you have a 100x100 image and you want to blur it using a 3x3 averaging kernel, how would you apply this kernel using matrix operations?
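Convolution: Slide the 3x3 kernel, whose nine entries are each 1/9, over the image matrix.
For each position, multiply the kernel element-wise with the overlapping 3x3 image patch and sum the products.
With one pixel of padding on each border, this yields a new 100x100 matrix; without padding, the result is 98x98.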
Matrix convolution is the basis of edge detection, blurring, and feature extraction in neural networks (CNNs).
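The sliding-window procedure above can be sketched with explicit loops. This is an illustrative version rather than an optimized one: the function name `blur3x3` is hypothetical, zero padding is an assumed border policy, and the test image is a constant placeholder.

```python
import numpy as np

def blur3x3(image):
    """Blur a 2D grayscale image with a 3x3 averaging kernel (zero padding)."""
    kernel = np.full((3, 3), 1.0 / 9.0)                        # each entry is 1/9
    padded = np.pad(image.astype(float), 1, mode="constant")   # one-pixel zero border
    rows, cols = image.shape
    out = np.empty((rows, cols), dtype=float)
    for i in range(rows):
        for j in range(cols):
            patch = padded[i:i + 3, j:j + 3]      # 3x3 patch centered on pixel (i, j)
            out[i, j] = np.sum(patch * kernel)    # element-wise multiply, then sum
    return out

image = np.full((100, 100), 90.0)   # placeholder image with constant intensity 90
blurred = blur3x3(image)
print(blurred.shape)                # same size as the input thanks to the padding
```

Because the averaging kernel is symmetric, flipping it (as strict convolution requires) changes nothing, so this loop is both a convolution and a cross-correlation. With zero padding, border pixels come out darker because the padded zeros enter the average.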
Q3. A social network has users as rows and columns, and matrix entry (i,j) = 1 if user i follows user j. How would you find the number of mutual follows between users? What operation gives the number of two-step connections?
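Users i and j follow each other when M[i,j] = M[j,i] = 1; the follow matrix itself is generally not symmetric.
Count mutual follows by multiplying M element-wise with its transpose and summing the entries above the diagonal, since each mutual pair appears twice in the product.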
For two-step connections (follows of follows), multiply the matrix by itself:
M²[i,j] counts the number of paths of length 2 from i to j.
This is a key concept in network analysis (e.g., PageRank algorithm).
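A small worked example may help. The 4-user follow matrix below is made up for illustration; mutual follows are found with an element-wise product of M and its transpose, and two-step connections with the matrix product M·M.

```python
import numpy as np

# Toy follow matrix (illustrative data): M[i, j] = 1 if user i follows user j.
M = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 1],
    [1, 0, 0, 0],
])

# mutual[i, j] = 1 only when i follows j AND j follows i.
mutual = M * M.T

# Each mutual pair appears at (i, j) and (j, i); count it once via the upper triangle.
num_mutual_pairs = int(np.triu(mutual, k=1).sum())

# M2[i, j] = number of length-2 paths i -> k -> j.
M2 = M @ M

print(num_mutual_pairs)   # 1: only users 0 and 1 follow each other
print(M2[0, 3])           # 1: the single path 0 -> 2 -> 3
```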
Q4. In a recommendation system, you have a user-item matrix where entries are ratings. This matrix is very sparse (most entries empty). How would you find similar users without filling all empty cells?
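Use matrix factorization: decompose the matrix into two smaller matrices U (user factors) and V (item factors) such that U·Vᵀ approximates the rating matrix on its observed entries only, so the empty cells never need to be filled.
User similarity can then be computed from the rows of U, for example with cosine similarity.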
This is the basis of collaborative filtering algorithms used by Netflix and Amazon.
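One simple way to fit such a factorization is plain gradient descent on the squared error over the observed ratings, sketched below. Everything specific here is an assumption made for illustration: the toy ratings, the use of 0 to mark a missing rating, the rank `k = 2`, and the learning rate and regularization values. Production systems typically rely on dedicated solvers instead.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy user-item ratings (illustrative); 0 marks a missing rating.
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 0, 1],
    [1, 0, 5, 4],
    [0, 1, 4, 5],
], dtype=float)
mask = R > 0                      # True only where a rating was observed

k = 2                             # number of latent factors (assumed)
lr, reg = 0.01, 0.01              # learning rate and L2 penalty (assumed)
U = 0.1 * rng.normal(size=(R.shape[0], k))   # user factors
V = 0.1 * rng.normal(size=(R.shape[1], k))   # item factors

for _ in range(5000):
    err = mask * (R - U @ V.T)    # error on observed cells; missing cells contribute 0
    grad_U = err @ V - reg * U
    grad_V = err.T @ U - reg * V
    U += lr * grad_U
    V += lr * grad_V

# Cosine similarity between users, computed from the rows of U.
U_unit = U / np.linalg.norm(U, axis=1, keepdims=True)
user_sim = U_unit @ U_unit.T
print(np.round(user_sim, 2))
```

The mask is what keeps the empty cells out of the fit: they add nothing to the error, so no placeholder rating is ever treated as real data.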
Q5. You have a matrix of pixel values for a batch of 100 grayscale images, each 28x28 pixels. What is the shape of this matrix if you flatten each image into a vector? How would you store a batch for neural network input?
Flatten each 28×28 image into a 1D vector of length 784.
The batch matrix becomes 100 rows (images) × 784 columns (pixels).
This is the standard input format for fully connected neural networks.
Then you can multiply by weight matrices to compute predictions.
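The flattening step and a single weight-matrix multiplication can be sketched as follows. The pixel values are random placeholders, and the layer size (10 outputs) plus the names `W` and `b` are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder batch: 100 grayscale images, 28x28, intensities 0-255.
images = rng.integers(0, 256, size=(100, 28, 28))

# Flatten each image into a row of 784 values and rescale to [0, 1].
batch = images.reshape(100, -1).astype(np.float32) / 255.0

# One fully connected layer with 10 outputs (size chosen for illustration).
W = rng.normal(size=(784, 10)).astype(np.float32)
b = np.zeros(10, dtype=np.float32)
outputs = batch @ W + b

print(batch.shape)     # (100, 784)
print(outputs.shape)   # (100, 10)
```

Row-major flattening places pixel (r, c) of an image at column 28·r + c of its row, so the original layout can be recovered with `reshape(28, 28)`.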
