feat: simplified exo tutorial with history and future vars #652

Open: wants to merge 1 commit into base: main
302 changes: 302 additions & 0 deletions nbs/docs/tutorials/01_exogenous_variables_reworked.ipynb
@@ -0,0 +1,302 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#| hide\n",
"!pip install -Uqq nixtla"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#| hide\n",
"from nixtla.utils import in_colab"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#| hide\n",
"IN_COLAB = in_colab()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#| hide\n",
"if not IN_COLAB:\n",
"    from nixtla.utils import colab_badge\n",
"    from dotenv import load_dotenv"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## What Are Exogenous Variables?"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Exogenous variables, or external factors, give a forecasting model additional\n",
"information that can influence the prediction. Examples include holiday\n",
"markers, marketing spend, weather data, or any other external data that\n",
"correlates with the time series you are forecasting.\n",
"\n",
"For example, if you're forecasting ice cream sales, temperature data could serve\n",
"as a useful exogenous variable. On hotter days, ice cream sales may increase."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## How to Use Exogenous Variables"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#| echo: false\n",
"if not IN_COLAB:\n",
"    load_dotenv()\n",
"    colab_badge('docs/tutorials/01_exogenous_variables_reworked')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To incorporate exogenous variables in TimeGPT, you'll need to pair each point\n",
"in your time series data with the corresponding external data."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Step 1: Import Packages"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Import the required libraries and initialize the Nixtla client."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"from nixtla import NixtlaClient\n",
"\n",
"nixtla_client = NixtlaClient(\n",
"    # defaults to os.environ.get(\"NIXTLA_API_KEY\")\n",
"    api_key=\"my_api_key_provided_by_nixtla\"\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Step 2: Load Dataset"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this tutorial, we'll predict day-ahead electricity prices. The dataset contains:\n",
"\n",
"- Hourly electricity prices (`y`) from various markets (identified by `unique_id`)\n",
"- Exogenous variables (`Exogenous1` to `day_6`)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"df = pd.read_csv(\"https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/electricity-short-with-ex-vars.csv\")\n",
"df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Step 3: Baseline Forecast without Exogenous Variables"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First, let's create a baseline forecast without using any exogenous variables."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"timegpt_fcst_no_ex_vars = nixtla_client.forecast(\n",
"    df=df[[\"unique_id\", \"ds\", \"y\"]],\n",
"    h=24,\n",
"    level=[80, 90]\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Step 4: Forecast with Exogenous Variables"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, let's create a forecast using the exogenous variables. To do so, you\n",
"need to provide both historical and future exogenous values. Below is an\n",
"example dataset of future exogenous variables. Note that it contains only the\n",
"future exogenous values, not the target variable `y`, which is what we are\n",
"forecasting."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"future_ex_vars_df = pd.read_csv(\"https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/electricity-short-future-ex-vars.csv\")\n",
"future_ex_vars_df.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Keep the formatting and columns consistent between the historical and future\n",
"exogenous datasets (e.g., date format, `unique_id` values, variable names)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"timegpt_fcst_ex_vars = nixtla_client.forecast(\n",
"    df=df,\n",
"    X_df=future_ex_vars_df,\n",
"    h=24,\n",
"    level=[80, 90]\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Step 5: Forecast Visualization"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Once you have generated your forecasts, you can visualize the results to compare\n",
"forecasts between the two methods above."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"timegpt_fcst_no_ex_vars.rename(columns={\"TimeGPT\": \"TimeGPT_no_ex_vars\"}, inplace=True)\n",
"timegpt_fcst_ex_vars.rename(columns={\"TimeGPT\": \"TimeGPT_ex_vars\"}, inplace=True)\n",
"\n",
"all_forecasts = (\n",
"    timegpt_fcst_no_ex_vars\n",
"    .merge(\n",
"        timegpt_fcst_ex_vars,\n",
"        how=\"outer\",\n",
"        on=[\"unique_id\", \"ds\"]\n",
"    )\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"nixtla_client.plot(\n",
"    df[[\"unique_id\", \"ds\", \"y\"]],\n",
"    all_forecasts,\n",
"    max_insample_length=1000,\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Key Takeaways\n",
"\n",
"- Exogenous variables enrich time series forecasting.\n",
"- Ensure proper alignment of historical and future exogenous data.\n",
"\n",
"## Next Steps\n",
"\n",
"Congratulations! You have mastered the fundamentals of adding exogenous\n",
"variables to your TimeGPT forecasts. Keep refining your approach by:\n",
"\n",
"- Exploring feature engineering to create domain-specific exogenous data.\n",
"- Experimenting with different modeling approaches for external variables.\n",
"- Validating forecast accuracy by comparing with real future data."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "python3",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
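
The notebook asks readers to keep historical and future exogenous data aligned, and that can be checked locally before calling `forecast`. The sketch below uses only pandas and assumes "aligned" means two things: both frames carry the same exogenous columns, and the future frame covers exactly the next `h` timestamps of every series. `check_future_exog` is a hypothetical helper, not part of the `nixtla` package. The hourly step and the tiny in-memory frames are also assumptions, standing in for the real CSVs.

```python
import pandas as pd


def check_future_exog(df, X_df, h, step=pd.Timedelta(hours=1),
                      id_col="unique_id", time_col="ds", target_col="y"):
    """Return a list of problems; an empty list means the two frames line up.

    Hypothetical helper for illustration only (not a nixtla API).
    """
    problems = []

    # 1. Both frames should expose the same exogenous column names.
    hist_exog = {c for c in df.columns if c not in (id_col, time_col, target_col)}
    future_exog = {c for c in X_df.columns if c not in (id_col, time_col)}
    if hist_exog != future_exog:
        problems.append(f"exogenous columns differ: {sorted(hist_exog ^ future_exog)}")

    # 2. For every series, the future rows should be the h steps after the last observation.
    last_seen = pd.to_datetime(df[time_col]).groupby(df[id_col]).max()
    future_ds = pd.to_datetime(X_df[time_col])
    for uid, last in last_seen.items():
        got = sorted(future_ds[X_df[id_col] == uid])
        expected = list(pd.date_range(last + step, periods=h, freq=step))
        if got != expected:
            problems.append(f"{uid}: future timestamps do not match the next {h} steps")

    return problems


# Toy data standing in for the tutorial's historical and future CSVs.
hist = pd.DataFrame({
    "unique_id": ["BE"] * 3,
    "ds": pd.date_range("2024-01-01 00:00", periods=3, freq=pd.Timedelta(hours=1)),
    "y": [50.0, 48.5, 47.0],
    "Exogenous1": [1.0, 2.0, 3.0],
})
future = pd.DataFrame({
    "unique_id": ["BE"] * 2,
    "ds": pd.date_range("2024-01-01 03:00", periods=2, freq=pd.Timedelta(hours=1)),
    "Exogenous1": [4.0, 5.0],
})

print(check_future_exog(hist, future, h=2))
```

With the tutorial's data, the equivalent call would be `check_future_exog(df, future_ex_vars_df, h=24)`, assuming the series are hourly with no gaps at the forecast boundary.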