Classification Project

In this end‑to‑end project, you will build a spam detection system using logistic regression. You will handle text data, vectorize messages, train a classifier, and evaluate performance.

Project: Classify SMS messages as spam or ham (not spam) using logistic regression.

Step 1: Load Data

We will use a public SMS spam collection dataset.

import pandas as pd

url = 'https://raw.githubusercontent.com/justmarkham/pycon-2016-tutorial/master/data/sms.tsv'
# The file is tab-separated and has no header row, so name the columns explicitly.
df = pd.read_csv(url, sep='\t', header=None, names=['label', 'message'])
print(df.head())
print(df['label'].value_counts())
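
To check the parsing options without downloading anything, the same call can be tried on a small in-memory sample. This is only a sketch: the three rows in `sample_tsv` are made up for illustration and are not taken from the dataset.

```python
import io

import pandas as pd

# Hypothetical three-row sample in the same label<TAB>message layout as sms.tsv.
sample_tsv = (
    "ham\tSee you at lunch?\n"
    "spam\tWIN a free prize now!\n"
    "ham\tCall me later\n"
)

sample_df = pd.read_csv(
    io.StringIO(sample_tsv),
    sep="\t",                     # a tab character, not the letter 't'
    header=None,                  # the file has no header row
    names=["label", "message"],
)

print(sample_df.shape)
print(sample_df["label"].value_counts())
```

If the separator is wrong, each row is usually not split into a label and a message as intended, so checking the shape and the label counts is a quick sanity test.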

Step 2: Preprocessing – Convert Text to Numbers

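Machine learning models need numeric input, so use TfidfVectorizer to turn each message into a sparse vector of TF‑IDF weights. A term gets a higher weight when it is frequent in a message but rare across the whole dataset.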

from sklearn.feature_extraction.text import TfidfVectorizer

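# Keep the 3,000 most frequent terms and drop common English stop words.
vectorizer = TfidfVectorizer(stop_words='english', max_features=3000)
# Note: fitting on every message (before the split in Step 3) lets test-set
# vocabulary and document frequencies influence the features.
X = vectorizer.fit_transform(df['message'])
# Encode the target: ham -> 0, spam -> 1.
y = df['label'].map({'ham': 0, 'spam': 1})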

Step 3: Train/Test Split

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
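
Two optional refinements are worth considering here. Spam is the minority class, so passing `stratify` keeps the spam ratio the same in both splits. Splitting the raw messages first and fitting the vectorizer on the training portion only keeps test data out of the vocabulary. The following is a minimal sketch of both ideas on made-up data, not a replacement for the steps above; all variable names are hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 8 ham (0) and 2 spam (1) messages.
toy_messages = [f"see you at {hour} today" for hour in range(8)] + [
    "win a free prize now",
    "free cash offer click here",
]
toy_labels = [0] * 8 + [1] * 2

# stratify keeps the ham/spam proportions equal in train and test.
msg_train, msg_test, lab_train, lab_test = train_test_split(
    toy_messages, toy_labels, test_size=0.5, random_state=42, stratify=toy_labels
)

# Learn the vocabulary from training messages only, then reuse it on the test set.
toy_vectorizer = TfidfVectorizer()
train_X = toy_vectorizer.fit_transform(msg_train)
test_X = toy_vectorizer.transform(msg_test)

print(sum(lab_train), sum(lab_test))
print(train_X.shape, test_X.shape)
```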

Step 4: Train Logistic Regression

from sklearn.linear_model import LogisticRegression

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
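
Under the hood, logistic regression computes a weighted sum of the features and passes it through the sigmoid function to get a probability of spam. The sketch below shows that calculation in plain Python. The weights, bias, and feature values are invented for illustration and are not learned from the SMS data.

```python
import math


def sigmoid(z: float) -> float:
    """Map any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_spam_probability(weights, features, bias):
    """Weighted sum of the features plus bias, squashed by the sigmoid."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


# Hypothetical values: two TF-IDF features with made-up weights.
weights = [2.0, -1.0]
bias = -0.5
features = [0.8, 0.1]

probability = predict_spam_probability(weights, features, bias)
label = 1 if probability >= 0.5 else 0  # 0.5 is the default decision threshold

print(f"P(spam) = {probability:.3f}, predicted label = {label}")
```

Training adjusts the weights and bias so that these probabilities match the labels as closely as possible.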

Step 5: Evaluate

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix

y_pred = model.predict(X_test)
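print(f"Accuracy: {accuracy_score(y_test, y_pred):.3f}")
# Precision, recall, and F1 are reported for the positive class (spam = 1).
print(f"Precision: {precision_score(y_test, y_pred):.3f}")
print(f"Recall: {recall_score(y_test, y_pred):.3f}")
print(f"F1: {f1_score(y_test, y_pred):.3f}")
# Rows are true labels, columns are predictions: [[TN, FP], [FN, TP]].
print(confusion_matrix(y_test, y_pred))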
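
All four scores can be derived from the confusion matrix counts, which makes it easier to see what each one measures. The sketch below does the arithmetic by hand. The counts are hypothetical and are not results from this model.

```python
def metrics_from_confusion(tn: int, fp: int, fn: int, tp: int) -> dict:
    """Compute accuracy, precision, recall, and F1 from confusion matrix counts."""
    accuracy = (tp + tn) / (tn + fp + fn + tp)
    # Precision: of the messages flagged as spam, how many really were spam?
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the real spam messages, how many were caught?
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts in scikit-learn's layout: [[TN, FP], [FN, TP]].
scores = metrics_from_confusion(tn=950, fp=5, fn=25, tp=120)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

For a spam filter, precision matters when wrongly blocking a real message is costly, while recall matters when letting spam through is costly.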

Step 6: Save Model and Vectorizer

import joblib

joblib.dump(model, 'spam_model.joblib')
joblib.dump(vectorizer, 'vectorizer.joblib')
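
At inference time, both files are loaded back and new messages go through the saved vectorizer before reaching the model. The sketch below runs that round trip on a tiny made-up training set and a temporary directory so that it is self-contained. The training messages and the new messages are hypothetical, and a model trained on so little data is not meaningful.

```python
import os
import tempfile

import joblib
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical toy training data (0 = ham, 1 = spam).
train_messages = [
    "win a free prize now",
    "free cash claim your reward",
    "are we meeting for lunch",
    "see you at home tonight",
]
train_labels = [1, 1, 0, 0]

toy_vectorizer = TfidfVectorizer()
toy_model = LogisticRegression(max_iter=1000)
toy_model.fit(toy_vectorizer.fit_transform(train_messages), train_labels)

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "spam_model.joblib")
    vectorizer_path = os.path.join(tmp_dir, "vectorizer.joblib")
    joblib.dump(toy_model, model_path)
    joblib.dump(toy_vectorizer, vectorizer_path)

    # Load both objects back, as a separate inference script would.
    loaded_model = joblib.load(model_path)
    loaded_vectorizer = joblib.load(vectorizer_path)

new_messages = ["claim your free prize", "lunch at home tomorrow"]
# Use transform, not fit_transform, so the saved vocabulary is reused.
new_features = loaded_vectorizer.transform(new_messages)
predictions = loaded_model.predict(new_features)
spam_probabilities = loaded_model.predict_proba(new_features)[:, 1]

for message, label, prob in zip(new_messages, predictions, spam_probabilities):
    print(f"{message!r}: label={label}, P(spam)={prob:.2f}")
```

Saving the vectorizer alongside the model matters because the model expects features in exactly the column order learned during training.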

Two Minute Drill

Load SMS data, convert text to TF‑IDF features.
Train logistic regression classifier.
Evaluate using accuracy, precision, recall, F1.
Save model and vectorizer for inference.
