Histograms and Scatter Plots

Histograms show the distribution of a single variable (e.g., customer ages). Scatter plots reveal the relationship between two variables (e.g., height vs. weight). Both are standard tools for understanding data before modeling.

Histogram

import matplotlib.pyplot as plt
import numpy as np

# Generate random data
data = np.random.randn(1000)

plt.hist(data, bins=30, edgecolor='black')
plt.title('Histogram of Normally Distributed Data')
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.show()
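Under the hood, a histogram is just a count of how many values fall into each bin. The following is a minimal sketch, assuming NumPy's np.histogram is used to get those counts without drawing anything. The seeded generator rng and the names counts, edges, and peak_center are illustrative choices, not part of the example above.

```python
import numpy as np

# Seeded generator so the numbers are reproducible (an assumption for this sketch)
rng = np.random.default_rng(0)
data = rng.standard_normal(1000)

# Split the data range into 30 equal-width bins and count the values in each
counts, edges = np.histogram(data, bins=30)

print(counts.sum())   # 1000: every sample lands in exactly one bin
print(len(edges))     # 31: N bins are delimited by N + 1 edges

# Center of the tallest bin, a rough estimate of where the data peaks
peak = counts.argmax()
peak_center = (edges[peak] + edges[peak + 1]) / 2
print(peak_center)
```

For standard normal data the tallest bin should sit close to 0, which matches the bell shape seen in the plot.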

Scatter Plot

# Reuses the numpy and matplotlib imports from the histogram example
x = np.random.rand(50) * 10
y = 2 * x + 1 + np.random.randn(50) * 2  # linear trend plus Gaussian noise

plt.scatter(x, y, alpha=0.7)
plt.title('Scatter Plot with Linear Trend')
plt.xlabel('X')
plt.ylabel('Y')
plt.show()
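The trend visible in a scatter plot can also be put into numbers. The sketch below assumes a straight-line fit with np.polyfit and the Pearson correlation from np.corrcoef are appropriate summaries for this data. The seeded generator and the variable names slope, intercept, and r are illustrative.

```python
import numpy as np

# Same recipe as the scatter data above, with a fixed seed for reproducibility
rng = np.random.default_rng(0)
x = rng.random(50) * 10
y = 2 * x + 1 + rng.standard_normal(50) * 2

# Least-squares straight line: coefficients come back highest power first
slope, intercept = np.polyfit(x, y, deg=1)

# Pearson correlation coefficient between x and y
r = np.corrcoef(x, y)[0, 1]

print(f"slope={slope:.2f}, intercept={intercept:.2f}, r={r:.2f}")
```

The fitted slope should land near the true value of 2 used to generate the data, and r should be close to 1, reflecting the strong linear trend.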

Why These Plots Matter for AI

  • Histogram: Check whether a feature looks roughly normally distributed (an assumption of some models), and spot skewness or outliers.
  • Scatter plot: Visualize how a feature relates to the target, including the strength of any correlation and non‑linear patterns.
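The visual checks listed above can be backed up with simple numbers. The following is a rough sketch, not a standard recipe from this tutorial: skewness and iqr_outliers are hypothetical helper names, and the 1.5 × IQR cutoff is a common convention assumed here.

```python
import numpy as np

def skewness(values):
    """Sample skewness: mean of cubed z-scores (about 0 for symmetric data)."""
    values = np.asarray(values, dtype=float)
    z = (values - values.mean()) / values.std()
    return float(np.mean(z ** 3))

def iqr_outliers(values, k=1.5):
    """Boolean mask of points more than k * IQR outside the quartiles."""
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    spread = q3 - q1
    return (values < q1 - k * spread) | (values > q3 + k * spread)

rng = np.random.default_rng(0)
symmetric = rng.standard_normal(1000)                  # bell-shaped
right_skewed = rng.exponential(scale=1.0, size=1000)   # long right tail

for name, sample in [("normal", symmetric), ("exponential", right_skewed)]:
    print(name, round(skewness(sample), 2), int(iqr_outliers(sample).sum()))
```

A skewness near 0 with few flagged points is consistent with a symmetric histogram, while a clearly positive skewness and many flagged points correspond to a long right tail.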

Customizing Histogram Bins

plt.hist(data, bins=50, density=True, alpha=0.6, color='g')  # density=True: total bar area is 1
plt.show()
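With density=True, the bar heights are no longer raw counts. They are rescaled so that the bars together cover an area of 1, which makes histograms of differently sized samples comparable. A small sketch of that property, assuming np.histogram (which applies the same normalization) and illustrative variable names:

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.standard_normal(1000)

# density=True returns heights of a probability density instead of counts
density, edges = np.histogram(data, bins=50, density=True)

# Area of each bar = height * bin width; the areas add up to 1
widths = np.diff(edges)
total_area = np.sum(density * widths)
print(total_area)
```

Because the heights depend on the bin width, individual bars can exceed 1 when bins are narrow. Only the total area is fixed.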

Scatter Plot with Color Mapping

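# One value per point; random here, but usually a third variable of interest
colors = np.random.rand(50)
plt.scatter(x, y, c=colors, cmap='viridis')
plt.colorbar()  # legend that maps colors back to values
plt.show()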
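Color mapping is most useful when the color encodes a real third variable rather than random numbers. The sketch below is one possible variation, assuming the noise added to each point is the quantity of interest. The function name colored_scatter and the axis labels are hypothetical choices for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

def colored_scatter(seed=0):
    """Scatter plot where color shows how far each point was pushed off the line."""
    rng = np.random.default_rng(seed)
    x = rng.random(50) * 10
    noise = rng.standard_normal(50) * 2
    y = 2 * x + 1 + noise

    fig, ax = plt.subplots()
    # c takes one value per point; cmap turns those values into colors
    points = ax.scatter(x, y, c=noise, cmap='viridis')
    # The colorbar needs the scatter object to know the value-to-color mapping
    fig.colorbar(points, ax=ax, label='Noise added to y')
    ax.set_xlabel('X')
    ax.set_ylabel('Y')
    ax.set_title('Scatter Plot Colored by Noise')
    return fig, points

fig, points = colored_scatter()
```

Calling plt.show() afterwards displays the figure. Points above the underlying line get colors from one end of the colormap and points below it from the other.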


Two Minute Drill
  • Histogram: plt.hist(data, bins=N) – shows distribution.
  • Scatter plot: plt.scatter(x, y) – shows relationship.
  • Use for EDA (Exploratory Data Analysis) before modeling.
  • Customize bins, colors, transparency for clarity.
