Email Spam and Phishing URL Detection

title	emoji	colorFrom	colorTo	sdk	sdk_version	app_file
Spam Email Detection	💌	pink	blue	gradio	3.17.0	app.py

This project uses Naive Bayes classification to detect whether an email is spam, and XGBoost classification to determine whether a URL within an email is phishing or legitimate.

Getting Started

Project Overview

The project consists of two main components:

- Email Spam Detection: This component uses Naive Bayes classification to label emails as spam or not spam based on their content features.
- Phishing URL Detection: This component uses XGBoost classification to identify whether URLs within emails point to phishing attempts or legitimate websites.
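
To make the first component concrete, here is a minimal sketch of multinomial Naive Bayes with add-one smoothing, written in plain Python so it runs without any dependencies. It is not the project's code: the trained model presumably comes from scikit-learn (listed in the requirements), and the class name `TinyNaiveBayes`, the tokenizer, and the four training sentences are all made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class TinyNaiveBayes:
    """Hypothetical multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # word frequencies per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training examples, for illustration only (not taken from spam.csv).
texts = [
    "win a free prize now",
    "claim your free cash reward",
    "meeting moved to monday morning",
    "please review the attached report",
]
labels = ["spam", "spam", "ham", "ham"]
model = TinyNaiveBayes().fit(texts, labels)
```

Calling `model.predict("free prize waiting")` scores the message under each class and returns the label with the higher log-probability. Working in log space avoids numeric underflow when many word probabilities are multiplied together.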

Prerequisites

Make sure you have Python 3.10 installed on your system. You can download it from

Requirements

Ensure you have the following dependencies installed. You can install them using pip install -r requirements.txt.

gunicorn==22.0.0
python-dateutil==2.8.2
gradio==4.32.1
gradio_client==0.17.0
requests==2.31.0
beautifulsoup4==4.12.3
googlesearch_python==1.2.4
urlextract==1.9.0
numpy==1.26.3
pandas==2.2.0
scikit-learn==1.5.0
urllib3==2.1.0
python-whois==0.9.4
xgboost==2.0.3
lxml==5.2.2
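
If the environment may already contain other versions of these packages, a small check like the following can help. It is only a sketch: `parse_pins` and `find_mismatches` are hypothetical helper names, and it handles only plain `name==version` lines like the ones listed above.

```python
from importlib import metadata


def parse_pins(requirements_text):
    """Map package name to pinned version for lines of the form name==version."""
    pins = {}
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if "==" in line:
            name, version = line.split("==", 1)
            pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins):
    """Return {name: (wanted, installed_or_None)} for packages that differ."""
    mismatches = {}
    for name, wanted in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches
```

For example, `find_mismatches(parse_pins(open("requirements.txt").read()))` returns an empty dictionary when every pinned package is installed at the listed version.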

Setup and Installation

Clone the repository:

git clone https://github.com/your-username/email-spam-phishing-detection.git
cd email-spam-phishing-detection

Install dependencies:
```
pip install -r requirements.txt
```

Usage

Data Preparation:
- Ensure the datasets spam.csv and urldata.csv are available in the data/ directory.
Model Training:
- If necessary, modify and run the notebook.ipynb Jupyter notebook to train or fine-tune the machine learning models.
- Trained models will be saved in the models/ directory.
Run the Application:
- Execute app.py to start the application.
- Access the application on its Hugging Face Space.
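
For the phishing side of the pipeline, the rough flow is: pull the URLs out of the email text, turn each URL into numeric features, and pass those features to the URL classifier. The sketch below illustrates the first two steps with the standard library only. It is an assumption-laden illustration, not the contents of URLFeatureExtraction.py: the regular expression stands in for the urlextract package from the requirements, and the feature names are hypothetical.

```python
import re
from urllib.parse import urlparse

# Simple stand-in pattern for http(s) URLs; real-world extraction is messier.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')]+")


def extract_urls(email_text):
    """Return the http(s) URLs found in the email body, in order of appearance."""
    # Strip punctuation that usually belongs to the sentence, not the URL.
    return [u.rstrip(".,;:!?") for u in URL_PATTERN.findall(email_text)]


def url_features(url):
    """Compute a few hypothetical lexical features for one URL."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "url_length": len(url),
        "has_ip_host": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
        "has_at_symbol": "@" in url,
        "host_dot_count": host.count("."),
        "uses_https": parsed.scheme == "https",
    }


# Made-up email text, for illustration only.
email = "Verify at http://192.168.0.1/login now, or visit https://www.example.com/help."
urls = extract_urls(email)
features = [url_features(u) for u in urls]
```

Each dictionary in `features` could then be converted to a row of numbers in a fixed column order before being handed to a trained classifier. Which features the project's model actually expects is not stated here, so the columns above would need to be replaced with the real ones.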

Acknowledgements

The email spam classification model is trained using the spam.csv dataset, sourced from Dataset: Spam/ham mail.
The URL phishing detection model is trained using the urldata.csv dataset, sourced from Phishing Websites Dataset.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Repository: iiakshat/spam-mail-detection