This project implements a Long Short-Term Memory (LSTM) recurrent neural network in Python to predict the number of international airline passengers. It uses the International Airline Passengers dataset, which records monthly totals of international airline passengers (in thousands) from January 1949 to December 1960.
The goal is to predict the number of passengers for a given month and year from the preceding historical observations. The project uses scikit-learn's TimeSeriesSplit for time-ordered cross-validation during hyperparameter tuning and evaluates the model with Root Mean Squared Error (RMSE).
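As a rough illustration of how TimeSeriesSplit keeps validation data strictly after the training data, here is a minimal sketch. The synthetic series and `n_splits=5` are assumptions chosen for illustration; the notebook's actual settings may differ.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# 144 placeholder observations standing in for the monthly series
# (synthetic values, not the real passenger counts).
series = np.arange(144, dtype=float)

# n_splits=5 is an illustrative choice, not necessarily the notebook's setting.
tscv = TimeSeriesSplit(n_splits=5)

folds = []
for fold, (train_idx, val_idx) in enumerate(tscv.split(series)):
    folds.append((train_idx, val_idx))
    # Each fold trains on an expanding window and validates on the block after it.
    print(f"fold {fold}: train 0..{train_idx[-1]}, validate {val_idx[0]}..{val_idx[-1]}")
```

Unlike a shuffled k-fold split, no fold ever validates on observations that precede its training window, which avoids leaking future values into training.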
- Source: International Airline Passengers Dataset
- Timeframe: January 1949 to December 1960 (12 years, 144 observations)
- Target: Monthly total of international airline passengers (in units of 1,000)
- Model: Long Short-Term Memory (LSTM) Recurrent Neural Network
- Optimization: TimeSeriesSplit for hyperparameter tuning and performance evaluation
- Error Metrics: Root Mean Squared Error (RMSE) for train and test sets
- Train RMSE: 20.90
- Test RMSE: 46.01
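For reference, RMSE is the square root of the mean squared difference between actual and predicted values. The sketch below shows the calculation with NumPy; the helper name `rmse` and the sample values are hypothetical and are not taken from the notebook.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root Mean Squared Error between two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Made-up values for illustration only (not real dataset rows).
actual = [100.0, 120.0]
predicted = [101.0, 113.0]
print(rmse(actual, predicted))
```

When predictions are converted back to the original scale before scoring, RMSE is expressed in the same units as the target, here thousands of passengers.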
.
├── data
│ └── airline_passengers.csv # Dataset file
├── models
│ └── model.pkl # Trained LSTM model
├── notebooks
│ └── time_series_analysis.ipynb # Jupyter Notebook for analysis
└── README.md # Project description
- Data Preparation: Loaded and preprocessed the time series data, normalized values, and created sequences for training and testing.
- Model Design: Designed an LSTM network to capture temporal dependencies in the data.
- Optimization: Used TimeSeriesSplit for hyperparameter tuning and model selection.
- Training: Trained the LSTM model on the dataset.
- Evaluation: Assessed the model's performance using RMSE on both train and test sets.
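The data preparation step can be sketched roughly as follows. This is a minimal example under stated assumptions: the series is synthetic, and the 67% chronological split, the `look_back` of 12 months, and the helper names `min_max_scale` and `make_sequences` are all hypothetical choices, not values confirmed by the notebook.

```python
import numpy as np

def min_max_scale(values, lo, hi):
    """Scale values linearly so that lo maps to 0 and hi maps to 1."""
    return (values - lo) / (hi - lo)

def make_sequences(values, look_back):
    """Turn a 1-D series into (samples, look_back, 1) inputs and next-step targets."""
    X, y = [], []
    for i in range(len(values) - look_back):
        X.append(values[i:i + look_back])   # window of past observations
        y.append(values[i + look_back])     # the value that follows the window
    return np.array(X).reshape(-1, look_back, 1), np.array(y)

# Synthetic stand-in for the 144 monthly totals (not the real data).
series = np.linspace(100.0, 600.0, 144)

# Chronological split; the 67% ratio is an assumption.
split = int(len(series) * 0.67)
train, test = series[:split], series[split:]

# Fit the scaling bounds on the training portion only, then reuse them for test.
lo, hi = train.min(), train.max()
train_scaled = min_max_scale(train, lo, hi)
test_scaled = min_max_scale(test, lo, hi)

look_back = 12  # assumed window length: one year of monthly history
X_train, y_train = make_sequences(train_scaled, look_back)
X_test, y_test = make_sequences(test_scaled, look_back)
print(X_train.shape, y_train.shape, X_test.shape, y_test.shape)
```

The three-dimensional input shape (samples, timesteps, features) matches what a Keras LSTM layer expects. Fitting the scaling bounds on the training portion only keeps test-set information out of training.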
- Python 3.8 or higher
- Libraries: TensorFlow, NumPy, Pandas, Matplotlib, scikit-learn
- Clone this repository:
git clone https://github.com/SrujanBhirud/Airline-Passenger-Prediction-Using-LSTM.git
- Navigate to the project directory:
cd Airline-Passenger-Prediction-Using-LSTM
- Install dependencies:
pip install -r requirements.txt
- Run the notebook:
jupyter notebook notebooks/time_series_analysis.ipynb
- The LSTM model captures the temporal patterns in the training data (train RMSE 20.90) but leaves room for improvement on the test data.
- The test RMSE (46.01) is more than twice the train RMSE, which indicates that the model generalizes less well to unseen data and suggests the need for more robust regularization or additional features.
- Experiment with additional architectures like GRU and Transformer models.
- Incorporate external features such as economic indicators, or apply seasonality adjustments to the series.
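One possible form of seasonality adjustment is seasonal differencing with a 12-month lag, sketched below. This is an illustrative assumption about how such an adjustment could be done, not something the project currently implements; the series is synthetic.

```python
import numpy as np

period = 12  # months per seasonal cycle

# Synthetic series: linear trend plus a yearly sine pattern (not the real data).
t = np.arange(48)
series = 100.0 + 2.0 * t + 10.0 * np.sin(2 * np.pi * t / period)

# Seasonal differencing: subtract the value from the same month one year earlier.
diff = series[period:] - series[:-period]

# The transform is invertible given the earlier observations.
restored = diff + series[:-period]

print(diff[:3])
```

In this synthetic case the yearly pattern cancels out and only the 12-month trend increment remains. A model trained on the differenced series would need its predictions mapped back with the inverse step shown above.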
- Dataset provided by Jason Brownlee (Machine Learning Mastery).