sabih-haider1/SMS-Spam-Detector: A machine learning project to classify SMS messages as Spam or Ham using Natural Language Processing (NLP) and classification algorithms.

📩 SMS Spam Detector

A machine learning project that detects whether a given SMS message is Ham (legitimate) or Spam, trained on a dataset of 54,000+ messages collected from multiple research sources. Built with Python, scikit-learn, and Flask, it demonstrates how to preprocess text, train a model, and expose predictions through a simple web interface.

🚀 Features

Dataset: 54,000+ labeled SMS messages (Ham/Spam) combined from multiple open-source research corpora.

Machine Learning model: Naive Bayes (scikit-learn).

Text preprocessing with CountVectorizer.

Achieves high accuracy (~98–99%) on test data.

Flask-powered web app with a clean HTML interface.

Predict whether a custom SMS is Spam or Ham instantly.
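
The model and preprocessing named above can be combined into a single scikit-learn pipeline. The following is a minimal sketch, not the code in Train.py: the six training messages, the `classify` helper, and the use of `make_pipeline` are assumptions made for illustration, and the real project trains on the merged dataset instead.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny illustrative sample (hypothetical); the project itself uses 54,000+ messages.
messages = [
    "What you doing? how are you?",
    "Are we still meeting for lunch today?",
    "I'll call you when I get home",
    "FreeMsg: Call this number to claim your reward!",
    "WINNER! Claim your free prize now, call this number",
    "Free reward waiting, call now to claim",
]
labels = ["ham", "ham", "ham", "spam", "spam", "spam"]

# CountVectorizer turns each message into word counts;
# MultinomialNB is the Naive Bayes variant suited to count features.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(messages, labels)


def classify(text: str) -> str:
    """Return 'ham' or 'spam' for a single message."""
    return model.predict([text])[0]


print(classify("Claim your free prize, call now"))
print(classify("how are you doing today"))
```

Keeping the vectorizer and the classifier in one pipeline means a single object can be fitted, saved, and later called on raw text.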

📊 Dataset

Ham messages: ~48,000

Spam messages: ~6,000

Sources: Combined from multiple open datasets used in SMS spam research (e.g. UCI SMS Spam Collection v1, Kaggle repositories, academic spam corpora).
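
Because ham outnumbers spam by roughly eight to one, it helps to know what a trivial classifier would score before reading any accuracy figure. The arithmetic below is a small sketch that uses only the approximate counts listed above; the variable names are illustrative.

```python
# Approximate class counts taken from the figures above.
ham_count = 48_000
spam_count = 6_000

total = ham_count + spam_count
spam_share = spam_count / total

# Accuracy of a classifier that labels every message as the majority class (ham).
majority_baseline = max(ham_count, spam_count) / total

print(f"spam share: {spam_share:.1%}")
print(f"always-ham baseline accuracy: {majority_baseline:.1%}")
```

With these counts the always-ham baseline already reaches about 89% accuracy, so per-class precision and recall are a useful complement to overall accuracy on this dataset.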

Example format:

ham What you doing? how are you?
spam FreeMsg: Call this number to claim your reward!
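
Lines in this format can be split into a label and a message at the first run of whitespace. The sketch below assumes that layout (a tab or space after the label); `parse_labeled_lines` is a hypothetical helper and is not taken from merge.py or Train.py.

```python
from typing import Iterable, List, Tuple

VALID_LABELS = {"ham", "spam"}


def parse_labeled_lines(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Split each 'label<whitespace>message' line into a (label, message) pair."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(None, 1)  # split once, on the first run of whitespace
        if len(parts) != 2 or parts[0] not in VALID_LABELS:
            continue  # skip lines that do not start with a known label
        rows.append((parts[0], parts[1]))
    return rows


sample = [
    "ham\tWhat you doing? how are you?",
    "spam\tFreeMsg: Call this number to claim your reward!",
    "",
    "not-a-label something else",
]
rows = parse_labeled_lines(sample)
print(rows)
```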

⚙️ Installation & Usage

Clone repo:

    git clone <repository-url>
    cd sms-spam-detector

Create environment & install dependencies:

    python3 -m venv venv
    source venv/bin/activate   # Mac/Linux
    venv\Scripts\activate      # Windows

    pip install -r requirements.txt

Train model:

    python3 Train.py

This creates spam_model.pkl (the trained Naive Bayes model).

Run web app:

    python3 app.py

Then open http://127.0.0.1:5000 in your browser.
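
The train-then-serve flow above can be sketched as follows. This is an illustration under stated assumptions, not the contents of app.py: the `/predict` route, the `message` JSON field, and the toy training data are hypothetical, and the sketch assumes the pickled object is a full pipeline (vectorizer plus classifier), which the source does not confirm for spam_model.pkl.

```python
import pickle

from flask import Flask, jsonify, request
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Stand-in for the training step: fit a toy pipeline and serialize it.
# The project would read the trained model from spam_model.pkl instead.
toy_model = make_pipeline(CountVectorizer(), MultinomialNB())
toy_model.fit(
    ["how are you", "see you at lunch", "claim your free reward", "call now to win a prize"],
    ["ham", "ham", "spam", "spam"],
)
model = pickle.loads(pickle.dumps(toy_model))  # round trip, as a file would do

app = Flask(__name__)


@app.route("/predict", methods=["POST"])  # hypothetical route name
def predict():
    payload = request.get_json(silent=True) or {}
    message = payload.get("message", "") if isinstance(payload, dict) else ""
    if not isinstance(message, str) or not message.strip():
        return jsonify(error="message is required"), 400
    label = model.predict([message])[0]
    return jsonify(label=str(label))


# Exercise the endpoint in-process with Flask's test client (no server needed).
with app.test_client() as client:
    response = client.post("/predict", json={"message": "claim your free prize"})
    result = response.get_json()
    empty_response = client.post("/predict", json={"message": "  "})

print(result)
```

Only unpickle model files you created yourself, since loading a pickle can execute arbitrary code.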

🔮 Future Improvements

Try other models (Logistic Regression, SVM, Deep Learning).

Add interactive charts for dataset statistics.

Deploy to Heroku/Render for a live demo link.

Build a React frontend to interact with the Flask API.
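
As a starting point for the first item, swapping in another classifier only requires changing the last step of a scikit-learn pipeline. The sketch below is an assumption-laden illustration: the toy messages and the `candidates` dictionary are hypothetical, and a real comparison would use a held-out split of the full dataset.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy data for illustration only.
train_texts = [
    "What you doing? how are you?",
    "Are we still meeting for lunch today?",
    "I'll call you when I get home",
    "FreeMsg: Call this number to claim your reward!",
    "WINNER! Claim your free prize now, call this number",
    "Free reward waiting, call now to claim",
]
train_labels = ["ham", "ham", "ham", "spam", "spam", "spam"]

candidates = {
    "naive_bayes": MultinomialNB(),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "linear_svm": LinearSVC(),
}

predictions = {}
for name, classifier in candidates.items():
    # Same preprocessing for every candidate; only the final estimator changes.
    pipeline = make_pipeline(CountVectorizer(), classifier)
    pipeline.fit(train_texts, train_labels)
    predictions[name] = pipeline.predict(["free prize, call now"])[0]

for name, label in predictions.items():
    print(f"{name}: {label}")
```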

📜 License

Free to use and modify.

Folders and files (4 commits):

templates
.DS_Store
Dataset_1.csv
Dataset_2.txt
README.md
SMSSpamCollection
Train.py
app.py
emails.csv
enron_spam_data.csv
merge.py
merged_spam_dataset.csv
requirements.txt
spam_model.pkl
task.txt
