🩺 Breast Cancer Classification with Machine Learning

🔹 Overview

This project demonstrates the classification of breast cancer using multiple machine learning algorithms. The main goal is to predict whether a tumor is malignant or benign based on features extracted from the Breast Cancer Wisconsin (Diagnostic) Dataset. The project also compares the performance of different models to identify the most effective approach.

📂 Project Structure

Data Preprocessing: Includes a check for missing values, train/test splitting, and feature scaling with MinMaxScaler.
Multiple Classification Models: Implements popular algorithms for robust comparison:
- Gaussian Naive Bayes
- K-Nearest Neighbors (KNN)
- Decision Tree
- Random Forest
- Support Vector Machine (SVM)
- Logistic Regression
- Artificial Neural Network (ANN)
Performance Evaluation: Computes comprehensive metrics including:
- Accuracy (Train/Test)
- Precision
- Recall
- F1-Score
- ROC Curve & AUC
Visualization: Clear bar charts and ROC curves for model comparison.
Reproducibility: Well-structured and documented code for easy understanding and reuse.
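The preprocessing steps listed above can be sketched roughly as follows. This is a minimal illustration rather than the notebook's exact code: the 80/20 split, the stratification, and `random_state=42` are assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Load the features and labels as NumPy arrays.
X, y = load_breast_cancer(return_X_y=True)

# Check for missing values before doing anything else.
n_missing = int(np.isnan(X).sum())
print(f"Missing values: {n_missing}")

# Hold out a test set (split ratio and seed are assumptions for this sketch).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Fit the scaler on the training data only, then reuse it on the test data,
# so no information from the test set leaks into preprocessing.
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
```

Fitting the scaler on the training split alone means the scaled test features can fall slightly outside the [0, 1] range, which is expected.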

🗂 Dataset

The project uses the Breast Cancer Wisconsin (Diagnostic) Dataset from scikit-learn:

Instances: 569
Features: 30 numeric features describing characteristics of cell nuclei from FNA (Fine Needle Aspirate) images.
Target: Binary classification
- Malignant (0)
- Benign (1)
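
As a quick sanity check, the figures above can be confirmed directly from scikit-learn. The snippet below is only an illustrative sketch; the variable names are chosen for this example.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

data = load_breast_cancer()

# Shape of the feature matrix: (instances, features).
n_samples, n_features = data.data.shape

# Class names in label order: index 0 is malignant, index 1 is benign.
class_names = [str(name) for name in data.target_names]

# Number of samples per class, indexed by label.
class_counts = np.bincount(data.target)

print(n_samples, n_features)   # 569 30
print(class_names)             # ['malignant', 'benign']
print(class_counts.tolist())   # [212, 357]
```

Note that the benign class is the larger one, so a stratified train/test split helps keep the class ratio the same in both sets.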

🛠 Algorithms

Algorithm	Description
Gaussian Naive Bayes (GNB)	Probabilistic classifier based on Bayes' theorem assuming Gaussian distribution of features.
K-Nearest Neighbors (KNN)	Classifies data based on the majority class among the k nearest neighbors.
Decision Tree (DT)	Splits data recursively to make classification decisions based on feature values.
Random Forest (RF)	Ensemble of decision trees to reduce overfitting and improve accuracy.
Support Vector Machine (SVM)	Finds the optimal hyperplane that maximizes margin between classes.
Logistic Regression (LR)	Linear model predicting the probability of a binary outcome.
Artificial Neural Network (ANN)	Multi-layer perceptron capable of learning complex non-linear patterns.
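
One possible way to set up the seven algorithms with scikit-learn is sketched below. The hyperparameters, the split settings, and the use of `MLPClassifier` for the ANN are assumptions made for this example; the notebook may configure the models differently or use another library for the neural network.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Illustrative settings only; none of these values come from the notebook.
models = {
    "GNB": GaussianNB(),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "DT": DecisionTreeClassifier(random_state=42),
    "RF": RandomForestClassifier(n_estimators=100, random_state=42),
    "SVM": SVC(kernel="rbf"),
    "LR": LogisticRegression(max_iter=1000),
    "ANN": MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000, random_state=42),
}

# Wrap each model in a pipeline so MinMaxScaler is fitted on training data only.
test_accuracy = {}
for name, model in models.items():
    pipeline = make_pipeline(MinMaxScaler(), model)
    pipeline.fit(X_train, y_train)
    test_accuracy[name] = pipeline.score(X_test, y_test)

# Print the models from highest to lowest test accuracy.
for name, acc in sorted(test_accuracy.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

Keeping the models in a dictionary makes it straightforward to loop over them for training, scoring, and plotting.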

💻 Installation

Ensure you have Python 3.7+ installed. Then install required dependencies:

pip install scikit-learn matplotlib seaborn pandas jupyter

🚀 Usage

1. Clone the repository:

git clone <your-repo-url>

cd <repo-folder>

2. Run the Jupyter Notebook:

jupyter notebook Breast_Cancer_Classification.ipynb

3. Follow the notebook to:

Preprocess the dataset
Train multiple models
Evaluate metrics (accuracy, precision, recall, F1-score)
Visualize model performance (bar charts and ROC curves)
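
The evaluation step can be illustrated with a small helper like the one below. The `evaluate` function is hypothetical (it is not taken from the notebook), and Logistic Regression with an 80/20 stratified split is used only as an example model and setup.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler


def evaluate(model, X_train, X_test, y_train, y_test):
    """Return train/test accuracy plus precision, recall and F1 on the test set."""
    y_pred = model.predict(X_test)
    return {
        "train_accuracy": accuracy_score(y_train, model.predict(X_train)),
        "test_accuracy": accuracy_score(y_test, y_pred),
        # By default scikit-learn treats label 1 (benign) as the positive class.
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
    }


X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

model = LogisticRegression(max_iter=1000).fit(X_train_scaled, y_train)
report = evaluate(model, X_train_scaled, X_test_scaled, y_train, y_test)

for metric, value in report.items():
    print(f"{metric}: {value:.3f}")
```

If recall on malignant tumors is the metric of interest, passing `pos_label=0` to `precision_score`, `recall_score`, and `f1_score` reports the scores for that class instead.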

📊 Results

The notebook generates visual comparisons of all models on the train and test sets.
ROC curves and AUC values are plotted for each model.
Together, these plots show which algorithm performs best on this dataset.
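
For reference, the points of a ROC curve and its AUC can be computed as in the sketch below. It uses a Random Forest with assumed settings and an assumed 80/20 split, and it leaves out the plotting itself; the notebook's own plotting code may differ.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Tree-based models are insensitive to feature scaling, so it is skipped here.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# ROC analysis needs a continuous score: the predicted probability of class 1.
scores = model.predict_proba(X_test)[:, 1]

# False positive rate and true positive rate at each decision threshold.
fpr, tpr, thresholds = roc_curve(y_test, scores)
auc_value = roc_auc_score(y_test, scores)

print(f"AUC: {auc_value:.3f}")
```

The `fpr` and `tpr` arrays can be passed to matplotlib's `plot` function to draw the curve, with one line per model for a side-by-side comparison.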

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.
