This repository contains two Jupyter Notebook implementations showcasing regression analysis using:
- Linear Regression
- XGBoost Regression
The notebooks analyze the Ecommerce Customers dataset, performing predictive modeling to determine customer behavior and associated trends.
- linear_reg.ipynb: Demonstrates regression using the LinearRegression model from Scikit-Learn.
- xgboost_reg.ipynb: Implements regression using the XGBoost library’s XGBRegressor.
Before running the notebooks, ensure you have the following installed:
- Python 3.8+
- Jupyter Notebook
- Required Python libraries:
  - pandas
  - numpy
  - seaborn
  - scikit-learn
  - xgboost
You can install the necessary libraries using:
pip install pandas numpy seaborn scikit-learn xgboost
Both notebooks use the Ecommerce Customers dataset. Ensure the dataset is available in the working directory before running the notebooks.
Dataset name: Ecommerce Customers
Columns include:
- Address
- Avatar
- Avg. Session Length
- Time on App
- Time on Website
- Length of Membership
- Yearly Amount Spent
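As a minimal sketch of how the dataset might be loaded, assuming it is stored as a CSV file and that the four numeric columns serve as features with Yearly Amount Spent as the target (neither detail is confirmed here). The helper name `load_xy` is hypothetical, and the two sample rows are made-up placeholders, not real records:

```python
import io

import pandas as pd

# Assumption: numeric columns are the features, Yearly Amount Spent is the target.
FEATURES = [
    "Avg. Session Length",
    "Time on App",
    "Time on Website",
    "Length of Membership",
]
TARGET = "Yearly Amount Spent"


def load_xy(source):
    """Read the CSV and split it into a feature frame and a target series."""
    df = pd.read_csv(source)
    return df[FEATURES], df[TARGET]


# Placeholder rows for illustration only; pass the real file path instead.
sample = io.StringIO(
    "Address,Avatar,Avg. Session Length,Time on App,Time on Website,"
    "Length of Membership,Yearly Amount Spent\n"
    "1 Example St,Violet,34.5,12.5,39.5,4.0,587.5\n"
    "2 Sample Ave,Teal,31.0,11.0,37.25,2.5,402.25\n"
)
X, y = load_xy(sample)
```

With the real file, the call would look like `load_xy("<dataset_file>.csv")`, where the file name depends on how the dataset was saved.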
The linear_reg.ipynb notebook covers:
- Data preprocessing and exploration.
- Building a regression model using LinearRegression from Scikit-Learn.
- Evaluating model performance with metrics like Mean Squared Error and R-squared.
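The steps above can be sketched roughly as follows. This is not the notebook's code: it uses synthetic data in place of the Ecommerce Customers dataset, and the 70/30 split, the random seeds, and the coefficients used to generate the data are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for four numeric features and a spending target.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X @ np.array([25.0, 38.0, 0.5, 61.0]) + 500.0 + rng.normal(scale=1.0, size=200)

# Hold out part of the data so the metrics reflect unseen samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

model = LinearRegression()
model.fit(X_train, y_train)
predictions = model.predict(X_test)

mse = mean_squared_error(y_test, predictions)
r2 = r2_score(y_test, predictions)
print(f"MSE: {mse:.3f}, R-squared: {r2:.4f}")
```

Because the synthetic target is linear in the features, the fit here is close to perfect; results on the real dataset will differ.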
The xgboost_reg.ipynb notebook includes:
- Data preprocessing and exploration.
- Building a regression model using XGBRegressor from the XGBoost library.
- Fine-tuning the model with hyperparameter optimization.
- Evaluating performance metrics for comparison with the linear regression model.
- Clone this repository:
git clone <repository_url>
cd <repository_folder>
- Start Jupyter Notebook:
jupyter notebook
- Open and run either linear_reg.ipynb or xgboost_reg.ipynb.
The outputs include:
- Insights into dataset relationships and trends using visualizations.
- Model performance metrics and predictions.
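One kind of visualization that could surface such relationships is a correlation heatmap. The sketch below assumes seaborn (with matplotlib, which it depends on) and uses synthetic columns named after a few of the dataset's fields; the plots actually produced in the notebooks may differ.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; not needed inside Jupyter

import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in data; the generating coefficients are arbitrary.
rng = np.random.default_rng(0)
time_on_app = rng.normal(12.0, 1.0, size=200)
membership = rng.normal(3.5, 1.0, size=200)
spent = 38.0 * time_on_app + 61.0 * membership + rng.normal(scale=10.0, size=200)

df = pd.DataFrame(
    {
        "Time on App": time_on_app,
        "Length of Membership": membership,
        "Yearly Amount Spent": spent,
    }
)

# Pairwise Pearson correlations, drawn as an annotated heatmap.
corr = df.corr()
ax = sns.heatmap(corr, annot=True, cmap="viridis")
ax.set_title("Feature correlations (synthetic data)")
```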
Feel free to fork the repository, submit issues, or create pull requests to enhance the notebooks.
This project is licensed under the MIT License.