nikhil-reddy05/Fake-News-Detection

Fake News Detection


🔍 Problem Statement

Fake news is a major issue in digital journalism and social media. The goal of this project is to automatically classify news articles as "Fake" or "Real" using Natural Language Processing (NLP) and Machine Learning/Deep Learning models.


📊 Dataset


⚙️ Features Implemented

✅ Data Preprocessing & EDA

  • Merged and labeled True.csv and Fake.csv
  • Cleaned HTML, URLs, punctuation, stopwords
  • Analyzed text length, word count, and named entities (NER)
  • Visualized class balance, word clouds, and NER distribution
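
The merge, label, and clean steps above can be sketched roughly as follows. This is a minimal standard-library illustration, not the repository's code: the function names `clean_text` and `merge_and_label`, the tiny stopword set, and the label convention (1 = real, 0 = fake) are all assumptions. The project lists NLTK, whose English stopword corpus would normally replace the small set here.

```python
import html
import re
import string

# Tiny illustrative stopword set (assumption); NLTK's English stopword
# corpus would normally be used instead.
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "on", "for", "at"}

TAG_RE = re.compile(r"<[^>]+>")
URL_RE = re.compile(r"https?://\S+|www\.\S+")
PUNCT_TABLE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def clean_text(text: str) -> str:
    """Strip HTML tags, URLs, punctuation, and stopwords; lowercase the rest."""
    text = html.unescape(text)      # decode entities such as &amp;
    text = TAG_RE.sub(" ", text)    # drop HTML tags first so URLs end cleanly
    text = URL_RE.sub(" ", text)    # drop URLs
    text = text.lower().translate(PUNCT_TABLE)
    tokens = [tok for tok in text.split() if tok not in STOPWORDS]
    return " ".join(tokens)


def merge_and_label(real_texts, fake_texts):
    """Combine both sources into (cleaned_text, label) pairs.

    Label convention is an assumption: 1 = real, 0 = fake.
    """
    rows = [(clean_text(t), 1) for t in real_texts]
    rows += [(clean_text(t), 0) for t in fake_texts]
    return rows


sample = "<p>The senator is LYING! Read more at https://example.com/story?id=1</p>"
print(clean_text(sample))
```

In the actual project the two inputs would be the text columns read from True.csv and Fake.csv.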

✅ Text Representation

  • TF-IDF Vectorization
  • Word2Vec Embeddings
  • GloVe 100d Pretrained Embeddings (for Deep Learning)
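
To make the TF-IDF representation concrete, here is a small hand-rolled sketch. It uses raw term counts, the smoothed IDF formula `ln((1 + n) / (1 + df)) + 1`, and L2 normalization, which matches scikit-learn's `TfidfVectorizer` defaults. The function name `tfidf` and the toy documents are illustrative assumptions; the project itself presumably relies on scikit-learn rather than code like this.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one L2-normalized {term: weight} mapping per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed inverse document frequency.
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1.0 for t, d in df.items()}

    vectors = []
    for doc in docs:
        counts = Counter(doc)
        weights = {t: c * idf[t] for t, c in counts.items()}
        norm = math.sqrt(sum(w * w for w in weights.values())) or 1.0
        vectors.append({t: w / norm for t, w in weights.items()})
    return vectors


# Toy tokenized documents, made up for illustration.
docs = [
    ["election", "fraud", "claim"],
    ["election", "result"],
    ["election", "fraud"],
]
vectors = tfidf(docs)
```

Terms that occur in every document (here "election") receive the lowest weight, while rarer terms such as "claim" are weighted higher.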

✅ Machine Learning Models

  • Logistic Regression (with GridSearchCV)
  • Naïve Bayes
  • Support Vector Machine (SVM)
  • Random Forest
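
As one possible shape for the Logistic Regression with GridSearchCV setup, the sketch below tunes a TF-IDF plus Logistic Regression pipeline with scikit-learn. The README does not say which hyperparameters were searched, so the grid over `C`, the 3-fold setting, the F1 scoring choice, and the toy corpus are all assumptions made for illustration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Toy stand-in corpus (made up); label convention assumed: 1 = real, 0 = fake.
texts = [
    "Senate passes budget bill after lengthy debate",
    "Central bank holds interest rates steady",
    "City council approves new transit funding",
    "Court upholds ruling on voting districts",
    "Health agency reports decline in flu cases",
    "Officials confirm trade talks will resume next week",
    "Shocking secret cure doctors do not want you to know",
    "You will not believe what this celebrity did",
    "Miracle pill melts fat overnight claims insider",
    "Secret plot exposed by anonymous insider",
    "Aliens endorse candidate in shocking leaked video",
    "Banks hate this one weird trick",
]
labels = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# Vectorizing inside the pipeline keeps each CV fold's vocabulary fit on
# that fold's training portion only.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Hypothetical grid over the regularization strength.
param_grid = {"clf__C": [0.1, 1.0, 10.0]}

search = GridSearchCV(pipeline, param_grid, cv=3, scoring="f1")
search.fit(texts, labels)
print(search.best_params_)
```

Keeping the vectorizer inside the pipeline is a deliberate choice: fitting TF-IDF on the full dataset before cross-validation would leak vocabulary statistics from the validation folds.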

✅ Deep Learning Model

  • Bidirectional LSTM with:
    • Pretrained GloVe embeddings (100d)
    • Two stacked BiLSTM layers
    • Dropout & L2 regularization
    • EarlyStopping and ReduceLROnPlateau
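
The architecture described above might look roughly like this in Keras. Only the overall structure comes from the README (100d pretrained embeddings, two stacked BiLSTM layers, dropout, L2 regularization, and the two callbacks). The layer widths, dropout rate, L2 strength, sequence length, vocabulary size, callback settings, and the choice to freeze the embeddings are all assumptions, and the zero matrix is a placeholder for real GloVe vectors.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers, regularizers

MAX_LEN = 300        # padded sequence length (assumption)
VOCAB_SIZE = 20000   # vocabulary size (assumption)
EMBED_DIM = 100      # matches the GloVe 100d vectors named in the README


def build_bilstm(embedding_matrix):
    """Two stacked BiLSTM layers on top of pretrained embeddings."""
    l2 = regularizers.l2(1e-4)  # L2 strength is an assumption
    model = keras.Sequential([
        keras.Input(shape=(MAX_LEN,)),
        layers.Embedding(
            input_dim=embedding_matrix.shape[0],
            output_dim=embedding_matrix.shape[1],
            embeddings_initializer=keras.initializers.Constant(embedding_matrix),
            trainable=False,  # freezing the vectors is an assumption
        ),
        # First BiLSTM returns the full sequence so a second one can follow.
        layers.Bidirectional(
            layers.LSTM(64, return_sequences=True, kernel_regularizer=l2)
        ),
        layers.Dropout(0.3),
        layers.Bidirectional(layers.LSTM(32, kernel_regularizer=l2)),
        layers.Dropout(0.3),
        layers.Dense(1, activation="sigmoid"),  # binary fake/real output
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model


# Stop when validation loss stalls, and lower the learning rate on plateaus.
callbacks = [
    keras.callbacks.EarlyStopping(monitor="val_loss", patience=3, restore_best_weights=True),
    keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=2),
]

# Placeholder matrix; in practice each row would hold a word's GloVe vector.
model = build_bilstm(np.zeros((VOCAB_SIZE, EMBED_DIM), dtype="float32"))
```

Training would then pass `callbacks=callbacks` and a validation split to `model.fit`.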

✅ Evaluation

  • Accuracy, Precision, Recall, F1-Score
  • Confusion Matrix
  • Cross-validation (for classical models)
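
For reference, the four headline metrics all follow from the confusion-matrix counts. The sketch below computes them by hand for a binary task. The function names and toy labels are illustrative, and treating label 1 as the positive class is an assumption. scikit-learn's `classification_report` and `confusion_matrix` provide the same numbers in practice.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary classification task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and F1 derived from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy labels, made up for illustration.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```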

🧠 Model Performance Summary

| Model | Accuracy |
|---|---|
| Logistic Regression | 98.7% |
| Naïve Bayes | 93.3% |
| SVM | 99.4% |
| Random Forest | 99.8% |
| BiLSTM (GloVe) | 99.9% ✅ |

⚠️ Note: Models were trained on tokenized, padded text with a proper validation strategy.
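
As a rough illustration of what tokenized and padded input means for a sequence model, the sketch below maps tokens to integer ids and pads every sequence to a fixed length. The helper names, the reserved ids, and the choice to pad and truncate at the end are assumptions. Keras's own `pad_sequences` pads and truncates at the start by default.

```python
from collections import Counter

PAD, OOV = 0, 1  # reserved ids for padding and out-of-vocabulary tokens


def build_vocab(tokenized_docs, max_words=20000):
    """Map the most frequent tokens to integer ids, starting at 2."""
    counts = Counter(tok for doc in tokenized_docs for tok in doc)
    return {word: i + 2 for i, (word, _) in enumerate(counts.most_common(max_words))}


def encode_and_pad(tokens, vocab, max_len):
    """Convert tokens to ids, truncate to max_len, then pad at the end."""
    ids = [vocab.get(tok, OOV) for tok in tokens][:max_len]
    return ids + [PAD] * (max_len - len(ids))


# Toy tokenized documents, made up for illustration.
docs = [["fake", "news", "spreads"], ["news", "travels", "fast"]]
vocab = build_vocab(docs)
padded = [encode_and_pad(doc, vocab, max_len=5) for doc in docs]
```

The vocabulary should be built from the training split only, so that validation and test text is encoded with ids the model has actually seen.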


🧪 Tools & Technologies

  • Languages: Python
  • Libraries: NumPy, pandas, scikit-learn, TensorFlow/Keras, NLTK, Matplotlib, Seaborn
  • Embeddings: GloVe (100d), Word2Vec
  • EDA Tools: spaCy, WordCloud
  • Version Control: Git, GitHub

🚀 Deployment (Optional)

The trained model can be deployed via:

  • Flask API for local predictions
  • Streamlit web app for UI

💡 Future Improvements

  • Integrate BERT or DistilBERT for transformer-based classification
  • Add SHAP/LIME for explainability
  • Add a self-learning (active learning) loop
  • Streamlit UI for production-level deployment
  • Deploy on Render / HuggingFace Spaces / GCP

🤝 Acknowledgements


👨‍💻 Author

Nikhil Reddy Banda
GitHub | LinkedIn

About

A machine learning project for detecting fake news using NLP, classical classifiers, and a BiLSTM deep learning model.
