A machine learning project to classify SMS messages as **Spam** or **Ham** using Natural Language Processing (NLP) and classification algorithms.

sabih-haider1/SMS-Spam-Detector

📩 SMS Spam Detector

A machine learning project that classifies a given SMS message as Ham (legitimate) or Spam, trained on a dataset of 54K+ messages collected from multiple research sources. Built with Python, scikit-learn, and Flask, it demonstrates how to preprocess text, train a model, and expose predictions through a simple web interface.

🚀 Features

Dataset: 54,000+ labeled SMS messages (Ham/Spam) combined from multiple open-source research corpora.

Machine Learning model: Naive Bayes (scikit-learn).

Text preprocessing with CountVectorizer.

Achieves high accuracy (~98–99%) on test data.

Flask-powered web app with clean HTML interface.

Predict whether a custom SMS is Spam or Ham instantly.
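
The preprocessing and model named in the list above can be sketched as a single scikit-learn pipeline. This is a minimal illustration, not the project's actual train.py: the choice of MultinomialNB as the Naive Bayes variant, the default CountVectorizer settings, and the six toy messages are all assumptions made for the example.

```python
import pickle

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training data (illustrative only; the real dataset has 54,000+ messages).
texts = [
    "FreeMsg: Call this number to claim your reward!",
    "Win a free prize, call now",
    "Claim your free reward today",
    "What you doing? how are you?",
    "Are we still meeting for lunch?",
    "I will call you later tonight",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# CountVectorizer turns each message into word counts;
# MultinomialNB (assumed variant) learns per-class word frequencies.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(texts, labels)

# Pickling the whole pipeline keeps the vocabulary and the classifier together.
restored = pickle.loads(pickle.dumps(model))

predictions = restored.predict([
    "claim your free prize now",
    "how are you doing",
])
print(list(predictions))
```

Bundling the vectorizer and classifier in one pipeline means a raw string can be passed straight to predict, which avoids a mismatch between the vocabulary used at training time and at prediction time.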

📊 Dataset

Ham messages: ~48,000

Spam messages: ~6,000

Sources: Combined from multiple open datasets used in SMS spam research (e.g. UCI SMS Spam Collection v1, Kaggle repositories, academic spam corpora).

Example format:

ham What you doing? how are you?
spam FreeMsg: Call this number to claim your reward!
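
A small sketch of reading lines in this format, assuming each line is a label (ham or spam) followed by whitespace and then the message text. The helper name parse_line is hypothetical, and the actual separator used in the project's data file may differ (for example a tab or a comma).

```python
from collections import Counter


def parse_line(line):
    """Split one 'label<whitespace>message' line into (label, text)."""
    parts = line.strip().split(None, 1)  # split on the first run of whitespace
    if len(parts) != 2 or parts[0] not in ("ham", "spam"):
        raise ValueError(f"unexpected line: {line!r}")
    return parts[0], parts[1]


sample = [
    "ham What you doing? how are you?",
    "spam FreeMsg: Call this number to claim your reward!",
]

rows = [parse_line(line) for line in sample]
label_counts = Counter(label for label, _ in rows)
print(rows[0])
print(label_counts)
```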

⚙️ Installation & Usage

  1. Clone the repository and enter the project directory:

     git clone <repository-url>
     cd sms-spam-detector

  2. Create a virtual environment and install dependencies:

     python3 -m venv venv
     source venv/bin/activate   # Mac/Linux
     venv\Scripts\activate      # Windows
     pip install -r requirements.txt

  3. Train the model:

     python3 train.py

     This creates spam_model.pkl (the trained Naive Bayes model).

  4. Run the web app:

     python3 app.py

     Then open http://127.0.0.1:5000 in a browser.
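
For orientation, here is a rough sketch of how a Flask app could serve predictions from a trained model. It is not the project's app.py: the /predict route, the message form field, the JSON response (the real app renders HTML), and the KeywordStub stand-in are all hypothetical, chosen so the example runs without spam_model.pkl.

```python
from flask import Flask, jsonify, request


def create_app(model):
    """Build a Flask app around any object exposing predict(list_of_str)."""
    app = Flask(__name__)

    @app.route("/predict", methods=["POST"])  # hypothetical route name
    def predict():
        message = request.form.get("message", "").strip()
        if not message:
            return jsonify(error="message is required"), 400
        label = model.predict([message])[0]
        return jsonify(label=str(label))

    return app


class KeywordStub:
    """Stand-in for the trained model so the sketch runs on its own."""

    def predict(self, messages):
        return ["spam" if "free" in m.lower() else "ham" for m in messages]


# Exercise the endpoint in-process with Flask's test client.
client = create_app(KeywordStub()).test_client()
response = client.post("/predict", data={"message": "FreeMsg: claim your reward"})
print(response.get_json())

# To serve a real model instead (assumes spam_model.pkl holds a pickled
# object with a predict method that accepts raw strings):
#   import pickle
#   with open("spam_model.pkl", "rb") as f:
#       create_app(pickle.load(f)).run(host="127.0.0.1", port=5000)
```

Passing the model into create_app keeps the route logic independent of how the model is loaded, which makes the endpoint easy to test with a stub.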

🖼 Screenshots

Screenshot 201 Screenshot 200

🔮 Future Improvements

Try other models (Logistic Regression, SVM, Deep Learning).

Add interactive charts for dataset statistics.

Deploy to Heroku or Render to provide a live demo link.

Build a React frontend to interact with the Flask API.

📜 License

Free to use and modify.
