ChronoGraph: A Real-World Graph-Based Multivariate Time Series Dataset

Authors: Adrian Catalin Lutu, Ioana Pintilie, Elena Burceanu, Andrei Manolache

Abstract: We present CHRONOGRAPH, a graph-structured multivariate time series forecasting dataset built from real-world production microservices. Each node is a service that emits a multivariate stream of system-level performance metrics, capturing CPU, memory, and network usage patterns, while directed edges encode dependencies between services. The primary task is forecasting future values of these signals at the service level. In addition, CHRONOGRAPH provides expert-annotated incident windows as anomaly labels, enabling evaluation of anomaly detection methods and assessment of forecast robustness during operational disruptions. Compared to existing benchmarks from industrial control systems or traffic and air-quality domains, CHRONOGRAPH uniquely combines (i) multivariate time series, (ii) an explicit, machine-readable dependency graph, and (iii) anomaly labels aligned with real incidents. We report baseline results spanning forecasting models, pretrained time-series foundation models, and standard anomaly detectors. CHRONOGRAPH offers a realistic benchmark for studying structure-aware forecasting and incident-aware evaluation in microservice systems.

arXiv preprint

Setup

The virtual environment used for development can be recreated using:

git clone https://github.com/bit-ml/ChronoGraph.git
cd ChronoGraph

python -m venv <env_name>

source <env_name>/bin/activate    # On macOS/Linux
<env_name>\Scripts\activate       # On Windows

pip install -r requirements.txt

ChronoGraph Dataset

ChronoGraph is a multivariate temporal graph dataset designed for forecasting and anomaly detection in service-oriented architectures.

It models a real-world system of microservices, capturing both the internal health metrics of each service (nodes) and the interaction-level metrics between them (edges). The dataset includes anomaly and service-disruption events labeled by Bitdefender experts.

📈 Key Statistics

Graph Type: Directed, Temporal
Total Nodes: 708 (representing services)
Total Edges: 1529 (representing service-to-service connections)
Node Features: 5 temporal features per node
Edge Features: 8 temporal features per edge
Labels: Node-level anomaly/disruption labels

🔬 Feature Details

Node Features (Service-Level Metrics)

Each node's 5 temporal features track its internal health and resource consumption. These include metrics such as:

cpu_usage
container_memory_usage
...and 3 other service-level indicators.

Edge Features (Connection-Level Metrics)

Each edge's 8 temporal features track the quality and volume of interactions between two services. These include metrics such as:

total_requests
latency
Various return code frequencies (e.g., 2xx, 4xx, 5xx)

🗂️ Dataset Structure

The dataset is provided as three kinds of files, covering the graph topology and the time-series data for all nodes and edges.

1. `edges.csv`

This file defines the static graph topology. It contains two columns:

source: The ID of the source service (node).
target: The ID of the target service (node).
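
As a rough illustration of how the topology could be read, the sketch below builds a directed adjacency list from the two columns described above. It assumes the file has a header row naming the columns `source` and `target` and that IDs can be kept as strings; the function name `load_edges` and the path in the usage note are hypothetical, not part of the repository.

```python
import csv
from collections import defaultdict
from typing import Dict, List, TextIO


def load_edges(handle: TextIO) -> Dict[str, List[str]]:
    """Build a directed adjacency list from rows with `source` and `target` columns.

    Assumption: the CSV has a header row with exactly these column names.
    """
    adjacency: Dict[str, List[str]] = defaultdict(list)
    for row in csv.DictReader(handle):
        # Edge direction follows the file: source -> target.
        adjacency[row["source"]].append(row["target"])
        # Services that only appear as targets are still registered as nodes.
        adjacency.setdefault(row["target"], [])
    return dict(adjacency)
```

A possible call, assuming the file sits under `dataset/`: `with open("dataset/edges.csv", newline="") as f: graph = load_edges(f)`. IDs are left as strings so they can be matched directly against the keys of the feature files.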

2. `node_features.json`

This file provides the temporal features for each node (service) in a nested JSON structure. Each value is paired with a corresponding timestep.

{
  "<service_id>": {
    "<metric_name>": {
      "values": [], // List of metric values
      "steps": [] // List of corresponding timesteps
    }
    // ... other 4 metrics for this service
  }
  // ... other services
}
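
The nested structure above can be flattened into step-ordered series with a few lines of Python. This is a minimal sketch, assuming the real file is plain JSON (the `//` comments in the schema are only illustrative) and that `values` and `steps` always have equal length; `parse_features`, `load_features`, and the path in the usage note are hypothetical names.

```python
import json
from typing import Any, Dict, List, Tuple

# One series is a list of (step, value) pairs ordered by step.
Series = List[Tuple[Any, Any]]


def parse_features(raw: Dict[str, dict]) -> Dict[str, Dict[str, Series]]:
    """Pair each value with its timestep and sort every series by step."""
    parsed: Dict[str, Dict[str, Series]] = {}
    for entity_id, metrics in raw.items():
        parsed[entity_id] = {}
        for metric_name, series in metrics.items():
            values, steps = series["values"], series["steps"]
            if len(values) != len(steps):
                raise ValueError(
                    f"{entity_id}/{metric_name}: {len(values)} values "
                    f"but {len(steps)} steps"
                )
            # Sort on the step only, so values never need to be comparable.
            parsed[entity_id][metric_name] = sorted(
                zip(steps, values), key=lambda pair: pair[0]
            )
    return parsed


def load_features(path: str) -> Dict[str, Dict[str, Series]]:
    """Read a feature file from disk and parse it."""
    with open(path, encoding="utf-8") as handle:
        return parse_features(json.load(handle))
```

For example, `load_features("dataset/node_features.json")["<service_id>"]["cpu_usage"]` would return the CPU series for one service, assuming that path. Keeping explicit `(step, value)` pairs avoids assuming that every metric is sampled at the same timesteps.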

3. `edge_features_part{i}.json`

These files, one per part index `i`, provide the temporal features for each edge (connection), keyed by a `source_id->target_id` string.

{
  "<source_id->target_id>": {
    "<metric_name>": {
      "values": [], // List of metric values
      "steps": [] // List of corresponding timesteps
    }
    // ... other 7 metrics for this connection
  }
  // ... other connections
}
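
Because the edge features are split across part files and keyed by a combined string, a loader typically needs to split the key and merge the parts. The sketch below shows one way to do this, assuming service IDs do not themselves contain `->` and that the parts follow a glob pattern such as `dataset/edge_features_part*.json`; the function names and the pattern are hypothetical.

```python
import glob
import json
from typing import Dict, Iterable, Tuple

Edge = Tuple[str, str]


def split_edge_key(key: str) -> Edge:
    """Split a "source_id->target_id" key into (source_id, target_id)."""
    source_id, separator, target_id = key.partition("->")
    if not separator or not source_id or not target_id:
        raise ValueError(f"not a source->target key: {key!r}")
    return source_id, target_id


def merge_edge_parts(parts: Iterable[Dict[str, dict]]) -> Dict[Edge, dict]:
    """Merge already-parsed part files into one mapping keyed by (source, target)."""
    merged: Dict[Edge, dict] = {}
    for part in parts:
        for key, metrics in part.items():
            # If an edge shows up in several parts, its metrics are combined.
            merged.setdefault(split_edge_key(key), {}).update(metrics)
    return merged


def load_edge_features(pattern: str) -> Dict[Edge, dict]:
    """Load every part file matching the glob pattern and merge them."""
    parts = []
    for path in sorted(glob.glob(pattern)):
        with open(path, encoding="utf-8") as handle:
            parts.append(json.load(handle))
    return merge_edge_parts(parts)
```

Tuple keys make it straightforward to check that every edge with features also appears in `edges.csv`.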
