This study examines the effectiveness of transformer-based models for financial time series forecasting, focusing on log returns derived from daily closing prices of the DAX40 index. We propose a decoder-only transformer model designed for immediate-term financial time series forecasting: the PatternDecoder architecture addresses the specific requirements of short-sequence financial forecasting and is evaluated with full, convolutional, and Auto-Correlation attention mechanisms. Several further transformer architectures are evaluated alongside it, including the standard Transformer encoder and its specialized variants Informer and Autoformer, which are designed to improve performance on time series data.
Time series forecasting is a critical task in financial analysis, enabling traders, investors, and financial institutions to make informed decisions based on predicted market movements. Traditional statistical methods like ARIMA and exponential smoothing have limitations in capturing complex non-linear patterns and short-term dependencies in financial data. Deep learning approaches, particularly transformer-based architectures, offer promising alternatives due to their ability to model sequential data effectively.
The Transformer model, initially introduced in Attention is All You Need, leverages self-attention mechanisms to capture dependencies and complex patterns in sequential data. Unlike recurrent neural networks, transformers process the entire sequence simultaneously, allowing them to capture relationships between any positions in the sequence regardless of their distance. The key components include the following (a brief code sketch follows the list):
- Multi-Head Attention: Allows the model to jointly attend to information from different representation subspaces
- Positional Encoding: Provides position information since transformers lack inherent sequential processing
- Feed-Forward Networks: Process the attention output through non-linear transformations
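For reference, the two core operations above can be sketched in a few lines of TensorFlow. This is an illustrative sketch only, not the repository's implementation (the actual mechanisms live in patterndecoder/attention.py and patterndecoder/embedding.py):

```python
import numpy as np
import tensorflow as tf

def scaled_dot_product_attention(q, k, v):
    # Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V
    d_k = tf.cast(tf.shape(k)[-1], tf.float32)
    scores = tf.matmul(q, k, transpose_b=True) / tf.sqrt(d_k)
    return tf.matmul(tf.nn.softmax(scores, axis=-1), v)

def sinusoidal_positional_encoding(length, d_model):
    # Fixed sine/cosine encoding that injects position information
    pos = np.arange(length)[:, None]
    i = np.arange(d_model)[None, :]
    angles = pos / np.power(10000.0, (2 * (i // 2)) / np.float32(d_model))
    pe = np.zeros((length, d_model), dtype=np.float32)
    pe[:, 0::2] = np.sin(angles[:, 0::2])
    pe[:, 1::2] = np.cos(angles[:, 1::2])
    return tf.constant(pe)
```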
In this study, the Transformer is evaluated with full, convolutional, and LogSparse attention mechanisms.
The Time Series Transformer, as proposed in Enhancing the Locality and Breaking the Memory Bottleneck of Transformer on Time Series Forecasting, enhances the standard transformer architecture by addressing challenges specific to time series data:
- Convolutional Self-Attention: Incorporates local pattern recognition capabilities using convolutional operations
- LogSparse Attention: Reduces memory complexity from $O(L^2)$ to $O(L(\log L)^2)$ by using a logarithmic sampling strategy
- Memory Efficiency: Enables processing of longer sequences with limited computational resources
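To make the logarithmic sampling idea concrete, the sketch below shows which past positions a query might attend to under a LogSparse-style pattern. The paper's full variant also keeps a local neighborhood around each query, so this is only illustrative:

```python
def logsparse_indices(query_pos):
    # Each query attends to itself and to positions at exponentially
    # growing distances into the past (1, 2, 4, 8, ... steps back),
    # giving O(log L) keys per query instead of O(L).
    idx, step = {query_pos}, 1
    while query_pos - step >= 0:
        idx.add(query_pos - step)
        step *= 2
    return sorted(idx)

print(logsparse_indices(30))  # [14, 22, 26, 28, 29, 30]
```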
The Informer architecture, introduced in Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting, aims to improve transformer efficiency for long sequence time-series forecasting through:
- ProbSparse Attention: Reduces computational complexity to $O(L \log L)$ by focusing on the most important query-key pairs
- Distilling Mechanism: Progressively halves the sequence length at each layer through convolutional operations
- Direct Multi-step Forecasting: Predicts the entire output sequence in one forward pass
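The distilling step can be illustrated with a minimal Keras block; the layer sizes here are assumptions rather than the Informer paper's exact configuration. A convolution followed by max pooling halves the temporal length, so stacking n such blocks shrinks a length-L sequence to roughly L / 2^n:

```python
import tensorflow as tf

def distilling_block(d_model):
    # Conv1D + ELU + max pooling: halves the sequence length each time
    return tf.keras.Sequential([
        tf.keras.layers.Conv1D(filters=d_model, kernel_size=3, padding="same"),
        tf.keras.layers.ELU(),
        tf.keras.layers.MaxPool1D(pool_size=2, strides=2),
    ])

x = tf.random.normal((8, 30, 64))      # (batch, length, d_model)
print(distilling_block(64)(x).shape)   # (8, 15, 64)
```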
The Autoformer, as presented in Autoformer: Decomposition Transformers with Auto-Correlation for Long-Term Series Forecasting, introduces innovative components for time series modeling:
- Series Decomposition: Separates time series into trend and seasonal components using moving average filters
- Auto-Correlation Mechanism: Replaces self-attention with an auto-correlation mechanism that captures lag-based periodicities
- Progressive Decomposition: Uses depth-decay to progressively reduce the influence of the trend component at deeper layers
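The series decomposition block essentially reduces to a moving-average split, sketched below. The kernel size is an assumed value and end-of-series padding is simplified; the project's actual implementation is in patterndecoder/autoformer.py:

```python
import tensorflow as tf

def series_decomposition(x, kernel_size=25):
    # Trend = moving average of the series, seasonal = residual around it.
    # x has shape (batch, length, features); "SAME" padding keeps the length.
    trend = tf.nn.avg_pool1d(x, ksize=kernel_size, strides=1, padding="SAME")
    seasonal = x - trend
    return seasonal, trend

x = tf.random.normal((8, 30, 1))
seasonal, trend = series_decomposition(x)
```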
Decoder-only transformers are neural network architectures composed exclusively of stacked decoder blocks, each containing masked self-attention and feed-forward layers, optimized for autoregressive text generation and time series tasks. This architecture underpins large language models like GPT-3 and GPT-4, enabling efficient generation of coherent and contextually relevant output through mechanisms such as residual connections and layer normalization (see Q. Chen (2025)).
The PatternDecoder is a decoder-only transformer model specifically designed for immediate-term financial time series forecasting. Its architecture addresses the requirements of short-sequence financial forecasting and is evaluated with full, convolutional, and Auto-Correlation attention mechanisms.
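To show what such a decoder block looks like structurally, here is a generic causal self-attention block in Keras. This is a sketch of the decoder-only pattern described above, not the actual PatternDecoder layer (see patterndecoder/patterndecoder.py), and the dimensions are placeholder values:

```python
import tensorflow as tf

class CausalDecoderBlock(tf.keras.layers.Layer):
    def __init__(self, d_model=64, num_heads=4, d_ff=128):
        super().__init__()
        self.attn = tf.keras.layers.MultiHeadAttention(num_heads=num_heads, key_dim=d_model)
        self.ffn = tf.keras.Sequential([
            tf.keras.layers.Dense(d_ff, activation="relu"),
            tf.keras.layers.Dense(d_model),
        ])
        self.norm1 = tf.keras.layers.LayerNormalization()
        self.norm2 = tf.keras.layers.LayerNormalization()

    def call(self, x):
        # x: (batch, length, d_model).
        # Masked self-attention: each position attends only to earlier positions,
        # followed by a position-wise feed-forward network, with residual connections.
        x = self.norm1(x + self.attn(x, x, use_causal_mask=True))
        return self.norm2(x + self.ffn(x))
```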
For the study, naive forecasts and other baseline methods serve as points of comparison for the transformer models.
To run the code in this repository, you need to clone the repository and install the package and its dependencies:
git clone https://github.com/saschque/patterndecoder.git
cd patterndecoder
python setup.py install
# or
# make init
The study uses historical daily closing prices of the DAX40 index spanning from March 28, 2022, to March 28, 2025, providing 768 trading days of information. The data is preprocessed by:
- Converting prices to daily log returns using the formula $r_t = \ln\left(\frac{P_t}{P_{t-1}}\right)$ to ensure stationarity, as confirmed by ADF and KPSS tests
- Using a $30$-day input window ($L = w = 30$) for 1-day-ahead ($h = 1$) prediction
- Splitting into training ($70\%$) and testing ($30\%$) sets, yielding $537$ training observations and $231$ test observations
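As an illustration of this preprocessing, the sketch below assumes a CSV of daily closing prices with a "Close" column; the file name is hypothetical, and the project's actual pipeline is implemented in patterndecoder/utils.py and the notebook:

```python
import numpy as np
import pandas as pd

# Hypothetical file name; replace with the CSV in the data/ directory
prices = pd.read_csv("data/dax40_daily.csv", parse_dates=True, index_col=0)

# Daily log returns: r_t = ln(P_t / P_{t-1})
log_returns = np.log(prices["Close"] / prices["Close"].shift(1)).dropna()

# Chronological 70/30 train/test split
split = int(len(log_returns) * 0.7)
train, test = log_returns.values[:split], log_returns.values[split:]

# Sliding windows: 30-day input (w = 30), 1-day-ahead target (h = 1)
w, h = 30, 1
X_train = np.stack([train[i:i + w] for i in range(len(train) - w - h + 1)])
y_train = np.array([train[i + w + h - 1] for i in range(len(train) - w - h + 1)])
```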
You can use the provided dataset in the data directory or download updated data using the commented code in the notebook.
You may also use a custom dataset. Make sure to align your dataset with the preprocessing methods used in this project.
- Clone the repository and install the package and its dependencies:
git clone https://github.com/saschque/patterndecoder.git
cd patterndecoder
python setup.py install
# or
# make init
- Run the Jupyter notebook to train and evaluate models:
jupyter notebook time_series_transformer_evaluation_paper.ipynb
- To train a specific model (e.g., Transformer with full attention):
from patterndecoder.utils import load_config, load_model, compile_and_train
import tensorflow as tf
tf.keras.backend.clear_session()
# Load parameters and hyperparameters from config/config.yaml
params = load_config("config/config.yaml")
# Load a time-series Transformer model that uses full attention
transformer_model = load_model(params, "models/transformer.yaml")
# Compile and train the model (train_data is the preprocessed training set)
_, transformer_model = compile_and_train(transformer_model, train_data)
The models are evaluated based on their Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) performance across training and testing datasets. The study uses Huber loss as the loss function and AdamW optimizer with learning rate 1×10⁻⁴. This study provides insights into the ability of transformer-based models and their variations to handle short-term dependencies, model complex temporal dynamics, and improve forecasting accuracy in immediate-term financial time series prediction. It offers a comparative perspective on the performance of these models relative to traditional methods and other deep learning techniques in the context of financial time series forecasting.
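A minimal sketch of this training and evaluation setup is shown below. It assumes a built Keras model and windowed arrays (X_train, y_train, X_test, y_test) from the preprocessing step, plus TensorFlow 2.11+ for the built-in AdamW optimizer; the epoch and batch-size values are assumptions, and compile_and_train in patterndecoder/utils.py wraps a configuration of this kind:

```python
import numpy as np
import tensorflow as tf

# Training configuration described above: Huber loss, AdamW, learning rate 1e-4
model.compile(
    optimizer=tf.keras.optimizers.AdamW(learning_rate=1e-4),
    loss=tf.keras.losses.Huber(),
)
model.fit(X_train, y_train, epochs=50, batch_size=32)  # assumed values

# Evaluation metrics reported in the study
preds = model.predict(X_test).squeeze()
mae = np.mean(np.abs(y_test - preds))
rmse = np.sqrt(np.mean((y_test - preds) ** 2))
```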
- setup.py: Setup file to create the python project package patterndecoder
- config/: Directory containing the settings file config.yaml
- data/: Directory containing the daily market data
- docs/: Directory containing supporting documentation
- models/: Directory containing the model configs and the weights of the pre-trained models
- patterndecoder/: Directory containing the modules of the package
- patterndecoder/attention.py: Implementation of various attention mechanisms
- patterndecoder/embedding.py: Implementation of various embedding mechanisms
- patterndecoder/transformer.py: Implementation of the base Transformer model
- patterndecoder/informer.py: Implementation of the Informer model
- patterndecoder/autoformer.py: Implementation of the Autoformer model
- patterndecoder/patterndecoder.py: Implementation of the PatternDecoder model
- patterndecoder/utils.py: Utility functions for data processing and evaluation
- patterndecoder_evaluation_study.ipynb: Main notebook for the evaluation
This project is licensed under the Apache-2.0 license - see the LICENSE file for details.