This project is a sentiment classifier for IMDb movie reviews. It uses a pre-trained GloVe word embedding model and a Bidirectional LSTM network to classify reviews as positive or negative.
- Loads IMDb movie reviews for training, validation, and testing.
- Uses GloVe embeddings for enhanced text representation.
- Trains a Bidirectional LSTM (Long Short-Term Memory) model to classify reviews as positive or negative.
- Achieves high accuracy on both validation and test sets.
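As a rough illustration of the GloVe step above, the sketch below parses vectors in the GloVe text format (one token per line, followed by its space-separated components) and builds an embedding matrix for a vocabulary. The function names parse_glove_lines and build_embedding_matrix, the toy vectors, and the choice to leave unknown words as all-zero rows are assumptions made for illustration; they are not taken from src/main.py.

```python
import io


def parse_glove_lines(lines):
    """Parse GloVe-format text lines into a {word: vector} dict."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def build_embedding_matrix(word_index, vectors, dim):
    """Return a list of rows; row i holds the vector for the word with index i.

    Assumes word indices run from 1 to len(word_index). Row 0 is reserved
    for padding, and words without a pre-trained vector stay all-zero
    (an assumption; random initialization is another common option).
    """
    matrix = [[0.0] * dim for _ in range(len(word_index) + 1)]
    for word, idx in word_index.items():
        vec = vectors.get(word)
        if vec is not None and len(vec) == dim:
            matrix[idx] = list(vec)
    return matrix


# Toy 3-dimensional vectors standing in for a real GloVe file.
toy_file = io.StringIO("good 0.1 0.2 0.3\nbad -0.1 -0.2 -0.3\n")
toy_vectors = parse_glove_lines(toy_file)
toy_matrix = build_embedding_matrix(
    {"good": 1, "bad": 2, "meh": 3}, toy_vectors, dim=3
)
```

A matrix of this shape is typically used as the initial weights of the model's embedding layer, so that words seen in the reviews start from their pre-trained vectors.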
- Clone this repository:
    git clone https://github.com/sminerport/IMDbSentimentClassifier.git
    cd IMDbSentimentClassifier
- Install dependencies:
    pip install -r requirements.txt
- Run the model:
    python src/main.py
The model uses IMDb review data split into training, validation, and test sets. These files are stored in the data/ directory and are managed with Git Large File Storage (Git LFS) to optimize storage and download efficiency.
To ensure access to the data files, please install Git LFS if you haven’t already. You can download it from the Git LFS website.
    # Install Git LFS
    git lfs install

Then, clone the repository as usual:

    git clone https://github.com/sminerport/IMDbSentimentClassifier.git
    cd IMDbSentimentClassifier

If you’ve already cloned the repository without Git LFS, run the following command to pull the LFS files:

    git lfs pull

To train the model:

    python src/main.py

When run, the script automatically downloads the GloVe embeddings and removes them afterward to save space.
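If training fails while reading the data, one possible cause is that the files in data/ are still Git LFS pointers rather than the real review files. The sketch below checks for that by looking for the first line of the Git LFS pointer format. The helper names is_lfs_pointer and find_unpulled_files are hypothetical and not part of this repository.

```python
from pathlib import Path

# Every Git LFS pointer file starts with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file looks like an un-pulled Git LFS pointer."""
    with open(path, "rb") as f:
        head = f.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


def find_unpulled_files(data_dir):
    """List files under data_dir that are still LFS pointers."""
    root = Path(data_dir)
    if not root.is_dir():
        return []
    return sorted(
        p for p in root.rglob("*") if p.is_file() and is_lfs_pointer(p)
    )


if __name__ == "__main__":
    # "data" matches the directory named in this README.
    for path in find_unpulled_files("data"):
        print(f"{path} is an LFS pointer; run `git lfs pull`")
```

If any files are reported, running git lfs pull as described above should replace the pointers with the actual data.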
Below is a snapshot of the model's training and validation accuracy and loss across epochs:
This image provides a visual summary of the training process. Each epoch displays the model's accuracy and loss on both the training and validation sets, showing the progression as the model improves over time.
The script will delete the GloVe embeddings and the saved model (best_model.keras) after evaluation to conserve storage. If you'd like to keep these files, set the cleanup variable to False in the script.
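The cleanup behavior described above could look roughly like the following. This is a minimal sketch under stated assumptions: the constant CLEANUP and the function remove_artifacts are hypothetical names, and the actual logic in src/main.py may differ.

```python
import os
import shutil

CLEANUP = True  # hypothetical name mirroring the script's cleanup variable


def remove_artifacts(paths, cleanup=CLEANUP):
    """Delete the given files or directories when cleanup is enabled.

    Returns the list of paths that were actually removed. Paths that do
    not exist are skipped silently.
    """
    removed = []
    if not cleanup:
        return removed
    for path in paths:
        if os.path.isdir(path):
            shutil.rmtree(path)
            removed.append(path)
        elif os.path.isfile(path):
            os.remove(path)
            removed.append(path)
    return removed
```

Under these assumptions, calling remove_artifacts(["best_model.keras"]) after evaluation would delete the saved model, while passing cleanup=False would leave every file in place.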
- To adjust storage usage, toggle the cleanup variable in the script.
- requirements.txt is generated by running pip freeze > requirements.txt in a Colab environment or your local environment.
This project is licensed under the MIT License. See the LICENSE file for more details.
