Skip to content

Commit a5bd1b2

Browse files
DOC add AR model example (#129)
* DOC add AR model example
1 parent 8d0c5ee commit a5bd1b2

File tree

13 files changed

+2984
-2618
lines changed

13 files changed

+2984
-2618
lines changed

.github/workflows/emscripten.yml

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -9,12 +9,12 @@ jobs:
99
steps:
1010
- uses: actions/checkout@v4
1111
- name: Build WASM wheel
12-
uses: pypa/cibuildwheel@v3.0.1
12+
uses: pypa/cibuildwheel@v3.1.2
1313
env:
1414
CIBW_PLATFORM: pyodide
1515
- name: Upload package
1616
uses: actions/upload-artifact@v4
1717
with:
1818
name: wasm_wheel
1919
path: ./wheelhouse/*_wasm32.whl
20-
if-no-files-found: error
20+
if-no-files-found: error

.github/workflows/static.yml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -9,7 +9,7 @@ jobs:
99

1010
steps:
1111
- uses: actions/checkout@v4
12-
- uses: prefix-dev/setup-pixi@v0.8.11
12+
- uses: prefix-dev/setup-pixi@v0.8.14
1313
with:
1414
environments: static
1515
cache: true

.github/workflows/test.yml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -16,7 +16,7 @@ jobs:
1616

1717
steps:
1818
- uses: actions/checkout@v4
19-
- uses: prefix-dev/setup-pixi@v0.8.11
19+
- uses: prefix-dev/setup-pixi@v0.8.14
2020
with:
2121
environments: >-
2222
dev

.github/workflows/wheel.yml

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@ jobs:
88
runs-on: ubuntu-latest
99
steps:
1010
- uses: actions/checkout@v4
11-
- uses: prefix-dev/setup-pixi@v0.8.11
11+
- uses: prefix-dev/setup-pixi@v0.8.14
1212
with:
1313
environments: dev
1414
cache: true
@@ -33,9 +33,9 @@ jobs:
3333
steps:
3434
- uses: actions/checkout@v4
3535
- name: Build wheels
36-
uses: pypa/cibuildwheel@v3.0.1
36+
uses: pypa/cibuildwheel@v3.1.2
3737
env:
38-
CIBW_SKIP: "*_i686 *_ppc64le *_s390x *_universal2 *-musllinux_*"
38+
CIBW_SKIP: "*_i686 *_ppc64le *_s390x *_universal2 *-musllinux_* cp314*"
3939
CIBW_PROJECT_REQUIRES_PYTHON: ">=3.10"
4040
CIBW_ARCHS_LINUX: auto
4141
CIBW_ARCHS_MACOS: x86_64 arm64

README.rst

Lines changed: 16 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -69,19 +69,27 @@ Getting Started
6969
... [-2.79, -0.02, -0.85 ],
7070
... [-1.34, -0.48, -2.55 ],
7171
... [ 1.92, 1.48, 0.65 ]]
72-
>>> y = [[0, 0], [1, 1], [0, 0], [1, 0]] # Multioutput feature selection
73-
>>> selector = FastCan(n_features_to_select=2, verbose=0).fit(X, y)
72+
>>> # Multioutput feature selection
73+
>>> y = [[0, 0], [1, 1], [0, 0], [1, 0]]
74+
>>> selector = FastCan(
75+
... n_features_to_select=2, verbose=0
76+
... ).fit(X, y)
7477
>>> selector.get_support()
7578
array([ True, True, False])
76-
>>> selector.get_support(indices=True) # Sorted indices
79+
>>> # Sorted indices
80+
>>> selector.get_support(indices=True)
7781
array([0, 1])
78-
>>> selector.indices_ # Indices in selection order
82+
>>> # Indices in selection order
83+
>>> selector.indices_
7984
array([1, 0], dtype=int32)
80-
>>> selector.scores_ # Scores for selected features in selection order
85+
>>> # Scores for selected features in selection order
86+
>>> selector.scores_
8187
array([0.91162413, 0.71089547])
8288
>>> # Here Feature 2 must be included
83-
>>> selector = FastCan(n_features_to_select=2, indices_include=[2], verbose=0).fit(X, y)
84-
>>> # We can find the feature which is useful when working with Feature 2
89+
>>> selector = FastCan(
90+
... n_features_to_select=2, indices_include=[2], verbose=0
91+
... ).fit(X, y)
92+
>>> # The feature which is useful when working with Feature 2
8593
>>> selector.indices_
8694
array([2, 0], dtype=int32)
8795
>>> selector.scores_
@@ -92,7 +100,7 @@ NARX Time Series Modelling
92100
--------------------------
93101
fastcan can be used for system identification.
94102
In particular, we provide a submodule `fastcan.narx` to build Nonlinear AutoRegressive eXogenous (NARX) models.
95-
For more information, check our `Home Page <https://fastcan.readthedocs.io/en/latest/>`_.
103+
For more information, check this `NARX model example <https://fastcan.readthedocs.io/en/latest/auto_examples/plot_narx.html>`_.
96104

97105

98106
Support Free-Threaded Wheels

doc/narx.rst

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -52,10 +52,11 @@ No matter how the discontinuity is caused, :class:`NARX` treats the discontinuou
5252
It is easy to notify :class:`NARX` that the time series data are from different measurement sessions by inserting np.nan to break the data. For example,
5353

5454
>>> import numpy as np
55-
>>> x0 = np.zeros((3, 2)) # First measurement session has 3 samples with 2 features
56-
>>> x1 = np.ones((5, 2)) # Second measurement session has 5 samples with 2 features
55+
>>> x0 = np.zeros((3, 2)) # First measurement session
56+
>>> x1 = np.ones((5, 2)) # Second measurement session
5757
>>> max_delay = 2 # Assume the maximum delay for NARX model is 2
58-
>>> u = np.r_[x0, [[np.nan, np.nan]]*max_delay, x1] # Insert (at least max_delay number of) np.nan to break the two measurement sessions
58+
>>> # Insert (at least max_delay number of) np.nan to break the two measurement sessions
59+
>>> u = np.r_[x0, [[np.nan, np.nan]]*max_delay, x1]
5960

6061
It is important to break the different measurement sessions by np.nan, because otherwise,
6162
the model will assume the time interval between the two measurement sessions is the same as the time interval within a session.

examples/plot_forecasting.py

Lines changed: 142 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,142 @@
1+
"""
2+
======================================
3+
Forecasting with (Nonlinear) AR models
4+
======================================
5+
6+
.. currentmodule:: fastcan.narx
7+
8+
In this examples, we will demonstrate how to use :func:`make_narx` to build (nonlinear)
9+
AutoRegressive (AR) models for time-series forecasting.
10+
The time series used isthe monthly average atmospheric CO2 concentrations
11+
from 1958 and 2001.
12+
The objective is to forecast the CO2 concentration till nowadays with
13+
initial 18 months data.
14+
15+
.. rubric:: Credit
16+
17+
* The majority of code is adapted from the scikit-learn tutorial
18+
`Forecasting of CO2 level on Mona Loa dataset using Gaussian process regression (GPR)
19+
<https://scikit-learn.org/stable/auto_examples/gaussian_process/plot_gpr_co2.html>`_.
20+
"""
21+
22+
# Authors: The fastcan developers
23+
# SPDX-License-Identifier: MIT
24+
25+
# %%
26+
# Prepare data
27+
# ------------
28+
# We use the data consists of the monthly average atmospheric CO2 concentrations
29+
# (in parts per million by volume (ppm)) collected at the Mauna Loa Observatory
30+
# in Hawaii, between 1958 and 2001.
31+
32+
from sklearn.datasets import fetch_openml
33+
34+
co2 = fetch_openml(data_id=41187, as_frame=True)
35+
co2 = co2.frame
36+
co2.head()
37+
38+
# %%
39+
# First, we process the original dataframe to create a date column and select it along
40+
# with the CO2 column. Here, date columns is used only for plotting, as there is no
41+
# inputs (including time or date) required in AR models.
42+
43+
import pandas as pd
44+
45+
co2_data = co2[["year", "month", "day", "co2"]].assign(
46+
date=lambda x: pd.to_datetime(x[["year", "month", "day"]])
47+
)[["date", "co2"]]
48+
co2_data.head()
49+
50+
51+
# %%
52+
# The CO2 concentration are from March, 1958 to December, 2001, which
53+
# is shown in the plot below.
54+
55+
import matplotlib.pyplot as plt
56+
57+
plt.plot(co2_data["date"], co2_data["co2"])
58+
plt.xlabel("date")
59+
plt.ylabel("CO$_2$ concentration (ppm)")
60+
_ = plt.title("Raw air samples measurements from the Mauna Loa Observatory")
61+
62+
# %%
63+
# We will preprocess the dataset by taking a monthly average to smooth the data.
64+
# The months which have no measurements were collected should not be dropped.
65+
# Because AR models require the time intervals between the two neighboring measurements
66+
# are consistent.
67+
# As the results, the NaN values should be kept as the placeholders to maintain the
68+
# time intervals, and :class:`NARX` can handle the missing values properly.
69+
70+
co2_data = co2_data.set_index("date").resample("ME")["co2"].mean().reset_index()
71+
plt.plot(co2_data["date"], co2_data["co2"])
72+
plt.xlabel("date")
73+
plt.ylabel("Monthly average of CO$_2$ concentration (ppm)")
74+
_ = plt.title(
75+
"Monthly average of air samples measurements\nfrom the Mauna Loa Observatory"
76+
)
77+
78+
# %%
79+
# For plotting, the time axis for training is from March, 1958 to December, 2001,
80+
# which is converted it into a numeric, e.g., March, 1958 will be converted to 1958.25.
81+
# The time axis for test is from March, 1958 to nowadays.
82+
83+
import datetime
84+
85+
import numpy as np
86+
87+
today = datetime.datetime.now()
88+
current_month = today.year + today.month / 12
89+
90+
x_train = (co2_data["date"].dt.year + co2_data["date"].dt.month / 12).to_numpy()
91+
x_test = np.arange(start=1958.25, stop=current_month, step=1 / 12)
92+
93+
# %%
94+
# Nonlinear AR model
95+
# ------------------
96+
# We can use :func:`make_narx` to easily build a nonlinear AR model, which does not
97+
# has a input. Therefore, the input ``X`` is set as ``None``.
98+
# :func:`make_narx` will search 10 polynomial terms, whose maximum degree is 2 and
99+
# maximum delay is 9.
100+
101+
from fastcan.narx import make_narx, print_narx
102+
103+
max_delay = 9
104+
model = make_narx(
105+
None,
106+
co2_data["co2"],
107+
n_terms_to_select=10,
108+
max_delay=max_delay,
109+
poly_degree=2,
110+
verbose=0,
111+
)
112+
model.fit(None, co2_data["co2"], coef_init="one_step_ahead")
113+
print_narx(model, term_space=27)
114+
115+
# %%
116+
# Forecasting performance
117+
# -----------------------
118+
# As AR model does not require input data, the input ``X`` in :func:`predict`
119+
# is used to indicate total steps to forecast. The initial conditions ``y_init``
120+
# is the first 18 months data from the ground truth, which contains missing values.
121+
# If there is no missing value given to ``y_init``, we can only use ``max_delay``
122+
# number of samples as the initial conditions.
123+
# However, if missing values are given to ``y_init``, :class:`NARX` will break
124+
# the data into multiple time series according to the missing values. For each
125+
# time series, at least ``max_delay`` number of samples, which does not have
126+
# missing values, are required to do the proper forecasting.
127+
# The results show our fitted model is capable to forecast to future years with
128+
# only first 18 months data.
129+
130+
y_pred = model.predict(
131+
len(x_test),
132+
y_init=co2_data["co2"][:18],
133+
)
134+
plt.plot(x_test, y_pred, label="Predicted", c="tab:orange")
135+
plt.plot(x_train, co2_data["co2"], label="Actual", linestyle="dashed", c="tab:blue")
136+
plt.legend()
137+
plt.xlabel("Year")
138+
plt.ylabel("Monthly average of CO$_2$ concentration (ppm)")
139+
_ = plt.title(
140+
"Monthly average of air samples measurements\nfrom the Mauna Loa Observatory"
141+
)
142+
plt.show()

examples/plot_narx_msa.py

Lines changed: 53 additions & 31 deletions
Original file line numberDiff line numberDiff line change
@@ -33,8 +33,11 @@
3333

3434
def duffing_equation(y, t):
3535
"""Non-autonomous system"""
36+
# y1 is displacement and y2 is velocity
3637
y1, y2 = y
38+
# u is sinusoidal input
3739
u = 2.5 * np.cos(2 * np.pi * t)
40+
# dydt is derivative of y1 and y2
3841
dydt = [y2, -0.1 * y2 + y1 - 0.25 * y1**3 + u]
3942
return dydt
4043

@@ -79,20 +82,26 @@ def auto_duffing_equation(y, t):
7982
# In the phase portraits, it is shown that the system has two stable equilibria.
8083
# We use one to generate training data and the other to generate test data.
8184

85+
# 10 s duration with 0.01 Hz sampling time,
86+
# so 1000 samples in total for each measurement
8287
dur = 10
8388
n_samples = 1000
89+
t = np.linspace(0, dur, n_samples)
90+
# External excitation is the same for each measurement
91+
u = 2.5 * np.cos(2 * np.pi * t).reshape(-1, 1)
8492

93+
# Small additional white noise
8594
rng = np.random.default_rng(12345)
86-
e_train = rng.normal(0, 0.0004, n_samples)
95+
e_train_0 = rng.normal(0, 0.0004, n_samples)
8796
e_test = rng.normal(0, 0.0004, n_samples)
88-
t = np.linspace(0, dur, n_samples)
8997

98+
# Solve differential equation to get displacement as y
99+
# Initial condition at displacement 0.6 and velocity 0.8
90100
sol = odeint(duffing_equation, [0.6, 0.8], t)
91-
u_train = 2.5 * np.cos(2 * np.pi * t).reshape(-1, 1)
92-
y_train = sol[:, 0] + e_train
101+
y_train_0 = sol[:, 0] + e_train_0
93102

103+
# Initial condition at displacement 0.6 and velocity -0.8
94104
sol = odeint(duffing_equation, [0.6, -0.8], t)
95-
u_test = 2.5 * np.cos(2 * np.pi * t).reshape(-1, 1)
96105
y_test = sol[:, 0] + e_test
97106

98107
# %%
@@ -111,8 +120,8 @@ def auto_duffing_equation(y, t):
111120
max_delay = 3
112121

113122
narx_model = make_narx(
114-
X=u_train,
115-
y=y_train,
123+
X=u,
124+
y=y_train_0,
116125
n_terms_to_select=5,
117126
max_delay=max_delay,
118127
poly_degree=3,
@@ -129,17 +138,19 @@ def plot_prediction(ax, t, y_true, y_pred, title):
129138
ax.set_ylabel("y(t)")
130139

131140

132-
narx_model.fit(u_train, y_train)
133-
y_train_osa_pred = narx_model.predict(u_train, y_init=y_train[:max_delay])
134-
y_test_osa_pred = narx_model.predict(u_test, y_init=y_test[:max_delay])
141+
# OSA NARX
142+
narx_model.fit(u, y_train_0)
143+
y_train_0_osa_pred = narx_model.predict(u, y_init=y_train_0[:max_delay])
144+
y_test_osa_pred = narx_model.predict(u, y_init=y_test[:max_delay])
135145

136-
narx_model.fit(u_train, y_train, coef_init="one_step_ahead")
137-
y_train_msa_pred = narx_model.predict(u_train, y_init=y_train[:max_delay])
138-
y_test_msa_pred = narx_model.predict(u_test, y_init=y_test[:max_delay])
146+
# MSA NARX
147+
narx_model.fit(u, y_train_0, coef_init="one_step_ahead")
148+
y_train_0_msa_pred = narx_model.predict(u, y_init=y_train_0[:max_delay])
149+
y_test_msa_pred = narx_model.predict(u, y_init=y_test[:max_delay])
139150

140151
fig, ax = plt.subplots(2, 2, figsize=(8, 6))
141-
plot_prediction(ax[0, 0], t, y_train, y_train_osa_pred, "OSA NARX on Train")
142-
plot_prediction(ax[0, 1], t, y_train, y_train_msa_pred, "MSA NARX on Train")
152+
plot_prediction(ax[0, 0], t, y_train_0, y_train_0_osa_pred, "OSA NARX on Train 0")
153+
plot_prediction(ax[0, 1], t, y_train_0, y_train_0_msa_pred, "MSA NARX on Train 0")
143154
plot_prediction(ax[1, 0], t, y_test, y_test_osa_pred, "OSA NARX on Test")
144155
plot_prediction(ax[1, 1], t, y_test, y_test_msa_pred, "MSA NARX on Test")
145156
fig.tight_layout()
@@ -152,14 +163,21 @@ def plot_prediction(ax, t, y_true, y_pred, title):
152163
#
153164
# The plot above shows that the NARX model cannot capture the dynamics at
154165
# the left equilibrium shown in the phase portraits. To improve the performance, let us
155-
# combine the training and test data for model training to include the dynamics of both
156-
# equilibria. Here, we need to insert (at least max_delay number of) `np.nan` to
157-
# indicate the model that training data
158-
# and test data are from different measurement sessions. The plot shows that the
166+
# append another measurement session to the training data to include the dynamics of
167+
# both equilibria. Here, we need to insert (at least max_delay number of) `np.nan` to
168+
# indicate the model that the original training data and the appended data
169+
# are from different measurement sessions. The plot shows that the
159170
# prediction performance of the NARX on test data has been largely improved.
160171

161-
u_all = np.r_[u_train, [[np.nan]] * max_delay, u_test]
162-
y_all = np.r_[y_train, [np.nan] * max_delay, y_test]
172+
e_train_1 = rng.normal(0, 0.0004, n_samples)
173+
174+
# Solve differential equation to get displacement as y
175+
# Initial condition at displacement 0.5 and velocity -1
176+
sol = odeint(duffing_equation, [0.5, -1], t)
177+
y_train_1 = sol[:, 0] + e_train_1
178+
179+
u_all = np.r_[u, [[np.nan]] * max_delay, u]
180+
y_all = np.r_[y_train_0, [np.nan] * max_delay, y_train_1]
163181
narx_model = make_narx(
164182
X=u_all,
165183
y=y_all,
@@ -170,17 +188,21 @@ def plot_prediction(ax, t, y_true, y_pred, title):
170188
)
171189

172190
narx_model.fit(u_all, y_all)
173-
y_train_osa_pred = narx_model.predict(u_train, y_init=y_train[:max_delay])
174-
y_test_osa_pred = narx_model.predict(u_test, y_init=y_test[:max_delay])
191+
y_train_0_osa_pred = narx_model.predict(u, y_init=y_train_0[:max_delay])
192+
y_train_1_osa_pred = narx_model.predict(u, y_init=y_train_1[:max_delay])
193+
y_test_osa_pred = narx_model.predict(u, y_init=y_test[:max_delay])
175194

176195
narx_model.fit(u_all, y_all, coef_init="one_step_ahead")
177-
y_train_msa_pred = narx_model.predict(u_train, y_init=y_train[:max_delay])
178-
y_test_msa_pred = narx_model.predict(u_test, y_init=y_test[:max_delay])
179-
180-
fig, ax = plt.subplots(2, 2, figsize=(8, 6))
181-
plot_prediction(ax[0, 0], t, y_train, y_train_osa_pred, "OSA NARX on Train")
182-
plot_prediction(ax[0, 1], t, y_train, y_train_msa_pred, "MSA NARX on Train")
183-
plot_prediction(ax[1, 0], t, y_test, y_test_osa_pred, "OSA NARX on Test")
184-
plot_prediction(ax[1, 1], t, y_test, y_test_msa_pred, "MSA NARX on Test")
196+
y_train_0_msa_pred = narx_model.predict(u, y_init=y_train_0[:max_delay])
197+
y_train_1_msa_pred = narx_model.predict(u, y_init=y_train_1[:max_delay])
198+
y_test_msa_pred = narx_model.predict(u, y_init=y_test[:max_delay])
199+
200+
fig, ax = plt.subplots(3, 2, figsize=(8, 9))
201+
plot_prediction(ax[0, 0], t, y_train_0, y_train_0_osa_pred, "OSA NARX on Train 0")
202+
plot_prediction(ax[0, 1], t, y_train_0, y_train_0_msa_pred, "MSA NARX on Train 0")
203+
plot_prediction(ax[1, 0], t, y_train_1, y_train_1_osa_pred, "OSA NARX on Train 1")
204+
plot_prediction(ax[1, 1], t, y_train_1, y_train_1_msa_pred, "MSA NARX on Train 1")
205+
plot_prediction(ax[2, 0], t, y_test, y_test_osa_pred, "OSA NARX on Test")
206+
plot_prediction(ax[2, 1], t, y_test, y_test_msa_pred, "MSA NARX on Test")
185207
fig.tight_layout()
186208
plt.show()

0 commit comments

Comments
 (0)