[WIP] Remaining TSDS edits #3222

Draft

marciw wants to merge 6 commits into main from mw-tsds-final-countdown

+399 −406

Contributor

marciw commented Sep 30, 2025 •

edited

Work in progress

Part of https://github.com/elastic/docs-team/issues/31?issue=elastic%7Cdocs-team%7C41

Status
🟢 Ready for PM/engineer review
🚧 Not ready for tech writer review

❗ Note for reviewers: We're going for "MVP" docs for now and tracking additional improvements in #3179

Changes

Revised overview: simplified, clarified
Revised setup: removed component templates, simplified
New advanced section (reindex, advanced concepts)

TODO:

Reconcile with recent changes to general data stream docs
Check docs patterns/style/etc.


          Remaining edits

95d852c

github-actions bot deployed to docs-preview

September 30, 2025 00:49

View deployment


          Merge branch 'main' into mw-tsds-final-countdown

1069b85

github-actions bot deployed to docs-preview

September 30, 2025 00:52

View deployment

marciw added 2 commits

September 29, 2025 21:03


          temp anchors

b68069d


          Merge branch 'mw-tsds-final-countdown' of https://github.com/elastic/…

0387a4e

…docs-content into mw-tsds-final-countdown

github-actions bot deployed to docs-preview

September 30, 2025 01:04

View deployment


          temp anchor

9bbda53

github-actions bot deployed to docs-preview

September 30, 2025 01:07

View deployment

github-actions bot commented Sep 30, 2025 •

edited

🔍 Preview links for changed docs

marciw requested review from yannis-roussos and kkrik-es

September 30, 2025 13:24

kkrik-es requested a review from gmarouli

September 30, 2025 13:27

marciw mentioned this pull request

Document index.dimensions-based routing #3229

Closed


          Document index.dimensions-based routing

d4fe9bc

github-actions bot deployed to docs-preview

September 30, 2025 15:21

View deployment

kkrik-es reviewed

View reviewed changes

manage-data/data-store/data-streams/reindex-tsds.md

    
              ### Create the destination data stream and reindex [tsds-reindex-op]

              Invoke the reindex api, for instance:

              Run the reindex operation using `op_type: create` to prevent overwrites:

Contributor

kkrik-es Oct 1, 2025

Maybe skip "to prevent overwrites"? This is a new data stream, so no overwriting is possible?

kkrik-es reviewed

View reviewed changes

manage-data/data-store/data-streams/time-series-data-stream-tsds.md

    
              Both a regular data stream and a time series data stream can store timestamped metrics data. 

              Use a time series data stream for metrics data only. For other timestamped data, such as logs or traces, use a [logs data stream](logs-data-stream.md) or regular data stream.

              Choose a time series data stream if you typically add metrics data to {{es}} in near real-time and in `@timestamp` order. For other timestamped data, such as logs or traces, use a [logs data stream](logs-data-stream.md) or [regular data stream](/manage-data/data-store/data-streams.md).

Contributor

kkrik-es Oct 1, 2025

Consider expanding what metrics data means. Here, we're looking for a sequence of data point-timestamp pairs, identified by one or more dimension fields that can be used for slicing in aggregation queries.

Contributor

kkrik-es Oct 1, 2025

This is nicely described in Time-series concepts below. Maybe add a cross reference, or have this section follow that one?

kkrik-es reviewed

View reviewed changes

manage-data/data-store/data-streams/time-series-data-stream-tsds.md

    
              * **Required fields:** In a TSDS, each document contains:

                * A `@timestamp` field

                * One or more [dimension fields](#time-series-dimension), set with `time_series_dimension: true`  

                * One or more [metric fields](#time-series-metric) (not strictly required, but typical for a TSDS)

Contributor

kkrik-es Oct 1, 2025

Let's remove the not strictly required part. A time-series requires a metric field with non-null values.

kkrik-es reviewed

View reviewed changes

manage-data/data-store/data-streams/time-series-data-stream-tsds.md

    
                * One or more [dimension fields](#time-series-dimension), set with `time_series_dimension: true`  

                * One or more [metric fields](#time-series-metric) (not strictly required, but typical for a TSDS)

              * **Document IDs:** Time series documents use two IDs: 

                  * An internal [`_tsid`](#tsid) metadata field, generated by {{es}} for each document in a TSDS and used for sorting and compression

Contributor

kkrik-es Oct 1, 2025

Still thinking about whether we need to expose the _tsid so readily in our documentation. If we do, we probably want to mention that it's calculated over all dimension values.

Another option is to have a section towards the end, shedding some light into how data gets structured internally.

Contributor

gmarouli Oct 1, 2025

+1, here I would mention that the id is calculated by es and cannot be provided, and if we want to elaborate more it should be in an implementation section. The reason is that I see it as an implementation detail and not as part of the API, if that makes sense.

kkrik-es reviewed

View reviewed changes

manage-data/data-store/data-streams/time-series-data-stream-tsds.md

    
                * One or more [metric fields](#time-series-metric) (not strictly required, but typical for a TSDS)

              * **Document IDs:** Time series documents use two IDs: 

                  * An internal [`_tsid`](#tsid) metadata field, generated by {{es}} for each document in a TSDS and used for sorting and compression

                  * The document `_id`, a generated hash of the document's dimensions and `@timestamp` (custom `_id` values are not supported)

Contributor

kkrik-es Oct 1, 2025

Maybe mention that it's auto-generated: passing a doc id during indexing results in an error.

kkrik-es reviewed

View reviewed changes

manage-data/data-store/data-streams/time-series-data-stream-tsds.md

    
                  * An internal [`_tsid`](#tsid) metadata field, generated by {{es}} for each document in a TSDS and used for sorting and compression

                  * The document `_id`, a generated hash of the document's dimensions and `@timestamp` (custom `_id` values are not supported)

              * **Backing indices:** A TSDS uses [time-bound indices](/manage-data/data-store/data-streams/time-bound-tsds.md) to store data from the same time period in the same backing index.

              * **Dimension-based routing:** The matching index template for a TSDS must contain the `index.routing_path` index setting, which specifies dimensions for routing documents to shards.

Contributor

kkrik-es Oct 1, 2025

This is not correct, the setting gets auto-generated if not present in the templates. We actually prefer to auto-generate, and it gets replaced by a different, internal setting that users can no longer touch.

It may suffice to note here that routing logic uses dimension field values to map data to shards per time series, improving storage efficiency and query performance.

kkrik-es reviewed

View reviewed changes

manage-data/data-store/data-streams/time-series-data-stream-tsds.md

    
              * **Backing indices:** A TSDS uses [time-bound indices](/manage-data/data-store/data-streams/time-bound-tsds.md) to store data from the same time period in the same backing index.

              * **Dimension-based routing:** The matching index template for a TSDS must contain the `index.routing_path` index setting, which specifies dimensions for routing documents to shards.

              * **Sorting:** A TSDS uses internal [index sorting](elasticsearch://reference/elasticsearch/index-settings/sorting.md) to order shard segments by `_tsid` and `@timestamp`, for better compression. Time series data streams do not use `index.sort.*` settings.

              * **Synthetic source:** A TSDS uses [synthetic `_source`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source), which has some [restrictions](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-restrictions) and [modifications](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-modifications).

Contributor

kkrik-es Oct 1, 2025

I wonder if we should be mentioning this here. Synthetic source is orthogonal to TSDS these days, and only available with an enterprise license. When available, it reduces the storage footprint with no loss of functionality, but everything works without it too.

Contributor

kkrik-es Oct 1, 2025

One difference, though, is that it's not possible to disable source for a TSDS. It's either standard or synthetic source mode.

kkrik-es reviewed

View reviewed changes

manage-data/data-store/data-streams/time-series-data-stream-tsds.md

    
              * A TSDS uses internal [index sorting](elasticsearch://reference/elasticsearch/index-settings/sorting.md) to order shard segments by `_tsid` and `@timestamp`.

              * TSDS documents only support auto-generated document `_id` values. For TSDS documents, the document `_id` is a hash of the document’s dimensions and `@timestamp`. A TSDS doesn’t support custom document `_id` values.

              * A TSDS uses [synthetic `_source`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source), and as a result is subject to some [restrictions](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-restrictions) and [modifications](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-modifications) applied to the `_source` field.

              You can use the {{esql}} [`TS` command](elasticsearch://reference/query-languages/esql/commands/ts.md) to query time series data streams. The `TS` command is optimized for time series data. It also enables the use of aggregation functions that efficiently process metrics per time series, before aggregating results.

Contributor

kkrik-es Oct 1, 2025

Shall we mention that it's in tech preview?

kkrik-es reviewed

View reviewed changes

manage-data/data-store/data-streams/time-series-data-stream-tsds.md

    
              :::

              In a TSDS, each {{es}} document represents an observation, or data point, in a specific time series. Although a TSDS can contain multiple time series, a document can only belong to one time series. A time series can’t span multiple data streams.

              In a TSDS, each {{es}} document represents an observation, or data point, in a specific time series. Although a TSDS can contain multiple time series, a document can belong to only one time series. A single time series can't span multiple data streams.

Contributor

kkrik-es Oct 1, 2025

That's not 100% correct. The proper definition of a time series includes the metric name. Since we can have multiple metric fields populated in a single doc, these map to multiple time series. @felixbarny too for thoughts here.
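To illustrate the point about metric names, here is a minimal Python sketch. It assumes a time series is keyed by the document's dimension values plus one metric name. The function `series_keys` and the field names are hypothetical, and this is not how {{es}} represents series internally.

```python
from typing import Dict, List, Tuple

SeriesKey = Tuple[Tuple[Tuple[str, str], ...], str]


def series_keys(dimensions: Dict[str, str], metrics: Dict[str, float]) -> List[SeriesKey]:
    """Return one (dimensions, metric name) key per populated metric field."""
    # Sort the dimensions so the key does not depend on field order.
    dim_key = tuple(sorted(dimensions.items()))
    return [(dim_key, name) for name in sorted(metrics)]


# Hypothetical document with two dimensions and two metric fields.
doc_dimensions = {"host": "web-1", "region": "eu"}
doc_metrics = {"cpu_usage": 0.5, "memory_usage": 0.7}

keys = series_keys(doc_dimensions, doc_metrics)
for key in keys:
    print(key)
```

Under this definition, the single document above contributes a data point to two time series, one per metric field.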

kkrik-es reviewed

View reviewed changes

manage-data/data-store/data-streams/time-series-data-stream-tsds.md

    
              ### Time series fields

              Compared to a regular data stream, a TSDS uses some additional fields specific to time series:  dimension fields (required) and metric fields (optional but usually defined), plus an internal `_tsid` metadata field.

Contributor

kkrik-es Oct 1, 2025

Ditto, let's not call metric fields optional.

kkrik-es reviewed

View reviewed changes

manage-data/data-store/data-streams/time-series-data-stream-tsds.md

    
              A TSDS document is uniquely identified by its time series and timestamp, both of which are used to generate the document `_id`. So, two documents with the same dimensions and the same timestamp are considered to be duplicates. When you use the `_bulk` endpoint to add documents to a TSDS, a second document with the same timestamp and dimensions overwrites the first. When you use the `PUT /<target>/_create/<_id>` format to add an individual document and a document with the same `_id` already exists, an error is generated.

              :::{tip}

              {{es}} uses dimensions and timestamps to generate time series document `_id` values. Two documents with the same dimensions and timestamp are considered duplicates.

Contributor

kkrik-es Oct 1, 2025

This renders the reference on _id above redundant, imho. Let's just keep this one.
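For readers of this thread, the duplicate rule in the tip can be sketched in Python. This is an illustration only: SHA-256 over a JSON payload is a stand-in, not the hash {{es}} actually uses, and `sketch_doc_id` is a hypothetical helper.

```python
import hashlib
import json


def sketch_doc_id(dimensions: dict, timestamp: str) -> str:
    """Derive an illustrative ID from dimensions and timestamp only."""
    # Metric values are deliberately left out: they do not affect the ID.
    payload = json.dumps(
        {"dimensions": dimensions, "@timestamp": timestamp}, sort_keys=True
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:20]


# Two readings with the same dimensions and timestamp, whatever their metrics.
first_id = sketch_doc_id({"host": "web-1"}, "2025-10-01T12:00:00Z")
duplicate_id = sketch_doc_id({"host": "web-1"}, "2025-10-01T12:00:00Z")
later_id = sketch_doc_id({"host": "web-1"}, "2025-10-01T12:00:01Z")

print(first_id == duplicate_id)  # same dimensions and timestamp
print(first_id == later_id)      # different timestamp
```

Because the ID depends only on dimensions and timestamp, the first two readings collide and count as duplicates, while the third does not.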

kkrik-es requested a review from felixbarny

October 1, 2025 06:54

kkrik-es reviewed

View reviewed changes

manage-data/data-store/data-streams/time-series-data-stream-tsds.md

    
              To work with a flattened field, use the `time_series_dimensions` parameter to configure an array of fields as dimensions. For details, refer to [`flattened`](elasticsearch://reference/elasticsearch/mapping-reference/flattened.md#flattened-params).

              You can also simplify dimension definitions by using [pass-through](elasticsearch://reference/elasticsearch/mapping-reference/passthrough.md#passthrough-dimensions) fields.

              :::

Contributor

kkrik-es Oct 1, 2025

Let's not hide this, it's probably the simplest and recommended way to define dimensions.

kkrik-es reviewed

View reviewed changes

manage-data/data-store/data-streams/time-series-data-stream-tsds.md

    
              #### Metrics [time-series-metric]

              Metrics are fields that contain numeric measurements, as well as aggregations and/or downsampling values based off of those measurements. While not required, documents in a TSDS typically contain one or more metric fields.

              Metrics are numeric measurements that change over time. Although metrics are not required, documents in a TSDS typically contain one or more metric fields.

Contributor

kkrik-es Oct 1, 2025

Suggested change

      
            Metrics are numeric measurements that change over time. Although metrics are not required, documents in a TSDS typically contain one or more metric fields. 
          
            Metrics are numeric measurements that change over time. Documents in a TSDS typically contain one or more metric fields.

kkrik-es reviewed

View reviewed changes

manage-data/data-store/data-streams/time-series-data-stream-tsds.md

    
              To mark a field as a metric, you must specify a metric type using the `time_series_metric` mapping parameter. The following field types support the `time_series_metric` parameter:

              To mark a field as a metric, use the `time_series_metric` mapping parameter. This parameter ensures data is stored in an optimal way for time series analysis. The following field types support the `time_series_metric` parameter:

              * [`aggregate_metric_double`](elasticsearch://reference/elasticsearch/mapping-reference/aggregate-metric-double.md)

Contributor

kkrik-es Oct 1, 2025

Nit: move this second, it's very rare that users populate it. It gets internally generated during downsampling, mostly.

kkrik-es reviewed

View reviewed changes

manage-data/data-store/data-streams/time-series-data-stream-tsds.md

    
              Due to the cumulative nature of counter fields, the following aggregations are supported and expected to provide meaningful results with the `counter` field: `rate`, `histogram`, `range`, `min`, `max`, `top_metrics` and `variable_width_histogram`. In order to prevent issues with existing integrations and custom dashboards, we also allow the following aggregations, even if the result might be meaningless on counters: `avg`, `box plot`, `cardinality`, `extended stats`, `median absolute deviation`, `percentile ranks`, `percentiles`, `stats`, `sum` and `value count`.

              ::::

              :   A cumulative metric that only monotonically increases or resets to `0` (zero). For example, a count of errors or completed tasks.

Contributor

kkrik-es Oct 1, 2025

, resetting when a serving process restarts.
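A small Python sketch may help show why resets matter for counters. It assumes a drop in value means the counter restarted from `0`, which is a common convention and not necessarily how the `rate` aggregation is implemented. `counter_increase` is a hypothetical helper.

```python
from typing import Sequence


def counter_increase(samples: Sequence[float]) -> float:
    """Total increase of a cumulative counter, tolerating resets to zero."""
    total = 0.0
    for previous, current in zip(samples, samples[1:]):
        if current >= previous:
            # Normal case: the counter only moved forward.
            total += current - previous
        else:
            # Assumed reset: the process restarted and counted up from 0.
            total += current
    return total


# The counter resets between 20 and 3, for example after a restart.
print(counter_increase([10, 15, 20, 3, 8]))
```

A plain `max - min` over the same samples would miss the increase that happened after the reset.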

kkrik-es reviewed

View reviewed changes

manage-data/data-store/data-streams/time-series-data-stream-tsds.md

    
              #### `_tsid` metadata field [tsid]

              The `_tsid` is an automatically generated object containing the document’s dimensions. It's intended for internal {{es}} use, so in most cases you won't need to work with it.

Contributor

kkrik-es Oct 1, 2025

Since this is defined here properly, let's skip the reference at the top.

kkrik-es reviewed

View reviewed changes

manage-data/data-store/data-streams/time-series-data-stream-tsds.md

    
              {{es}} uses [compression algorithms](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#index-codec) to compress repeated values. This compression works best when repeated values are stored near each other — in the same index, on the same shard, and side-by-side in the same shard segment.

              Most time series data contains repeated values. Dimensions are repeated across documents in the same time series. The metric values of a time series may also change slowly over time.

              - You **can't** query or update the internal `_tsid` field.

Contributor

kkrik-es Oct 1, 2025

I'd skip these 3 points and just keep the last one, to highlight why it should not be used in queries.

kkrik-es reviewed

View reviewed changes

manage-data/data-store/data-streams/set-up-tsds.md

    
              - **Index patterns:** One or more wildcard patterns matching the name of your TSDS, such as `weather-sensors-*`. For best results, use the [data stream naming scheme](/reference/fleet/data-streams.md#data-streams-naming-scheme).

              - **Data stream object:** The template must include `"data_stream": {}`.

              - **Time series mode:** Set `index.mode: time_series`.

              - **Field mappings:** Define at least one `keyword` dimension field and typically one or more metric fields:

Contributor

kkrik-es Oct 1, 2025

Suggested change

      
            - **Field mappings:** Define at least one `keyword` dimension field and typically one or more metric fields:
          
            - **Field mappings:** Define at least one dimension field and typically one or more metric fields:

Contributor

kkrik-es Oct 1, 2025

Dimensions are no longer required to be keyword fields.
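For reference, the template requirements listed in the quoted section could be assembled as in the following Python sketch. The field names `sensor_id` and `temperature` and the pattern are hypothetical placeholders. The body is only printed here, and would typically be sent with a `PUT _index_template/<name>` request.

```python
import json

# Illustrative index template body for a TSDS; field names are placeholders.
index_template = {
    "index_patterns": ["weather-sensors-*"],
    "data_stream": {},  # required for any data stream template
    "template": {
        "settings": {"index.mode": "time_series"},
        "mappings": {
            "properties": {
                "@timestamp": {"type": "date"},
                # Dimension field: identifies the time series.
                "sensor_id": {"type": "keyword", "time_series_dimension": True},
                # Metric field: a gauge can go up or down over time.
                "temperature": {"type": "double", "time_series_metric": "gauge"},
            }
        },
    },
}

print(json.dumps(index_template, indent=2))
```

The sketch uses a `keyword` dimension only as an example. As noted above, other field types can also be marked as dimensions.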

kkrik-es reviewed

View reviewed changes

manage-data/data-store/data-streams/set-up-tsds.md

    
              - **Time series mode:** Set `index.mode: time_series`.

              - **Field mappings:** Define at least one `keyword` dimension field and typically one or more metric fields:

                  - To define a dimension, set `time_series_dimension` to `true`. Dimension fields like `counter` only increase over time. For more details, refer to [Dimensions](/manage-data/data-store/data-streams/time-series-data-stream-tsds.md#time-series-dimension).

                  - To define a metric, use the `time_series_metric` mapping parameter. Metric fields like `gauge` can increase or decrease over time. For more details, refer to [Metrics](/manage-data/data-store/data-streams/time-series-data-stream-tsds.md#time-series-metric).

Contributor

kkrik-es Oct 1, 2025

I think we should either mention counters too, or just stick to the cross reference for more details.

kkrik-es reviewed

View reviewed changes

manage-data/data-store/data-streams/set-up-tsds.md

    
              - **Data stream object:** The template must include `"data_stream": {}`.

              - **Time series mode:** Set `index.mode: time_series`.

              - **Field mappings:** Define at least one `keyword` dimension field and typically one or more metric fields:

                  - To define a dimension, set `time_series_dimension` to `true`. Dimension fields like `counter` only increase over time. For more details, refer to [Dimensions](/manage-data/data-store/data-streams/time-series-data-stream-tsds.md#time-series-dimension).

Contributor

kkrik-es Oct 1, 2025

I wonder if we should be referencing pass-through fields here. Dimensions are often defined dynamically, so the pass-through object can be used as a dimension container to simplify definitions. See also:

https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/passthrough#passthrough-dimensions

gmarouli reviewed

View reviewed changes

manage-data/data-store/data-streams/set-up-tsds.md

    
              :::{dropdown} Create an ILM policy

              ## Create an index lifecycle policy [tsds-ilm-policy]

              If you're using {{stack}}, {{ilm-init}} can help you manage a time series data stream's backing indices. {{ilm-init}} requires an index lifecycle policy.

Contributor

gmarouli Oct 1, 2025

I am not in favour of adding ILM here for the following reasons:

- Not available in serverless, meaning that for serverless we have provided no way of automating lifecycle management.
- Too verbose. I understand it's an optional step, but it's still the first step we show.

I think the example should use data stream lifecycle, propose ILM as an alternative, and reference its documentation for more info.

What do you think?

gmarouli reviewed

View reviewed changes

manage-data/data-store/data-streams/set-up-tsds.md

    
                  ```

              You can convert an existing regular data stream to a TSDS. Follow these steps:

              1. Update your existing index template to include time series settings. Also update your index lifecycle policy (if any) and component templates (if any).

Contributor

gmarouli Oct 1, 2025

The second sentence here is vague. When I read it, I have no idea what kind of updates it is talking about. Maybe something like:

Update your existing index template and/or component templates (if any) to include time series settings.

It is reasonable to bundle the index template and component templates together, because depending on the setup it might be enough to update only one component template.

About the ILM policy, I have no idea what updates it is referring to, that's why I would suggest to remove it.

gmarouli reviewed

View reviewed changes

manage-data/data-store/data-streams/set-up-tsds.md

    
              After creating the index template, you can create a time series data stream by [indexing a document](use-data-stream.md#add-documents-to-a-data-stream). The TSDS is created automatically when you index the first document, as long as the index name matches the index template pattern. You can use a bulk API request or a POST request.

              :::{important}

              To test the following `_bulk` example, update the timestamps to within three hours of your current time. Data added to a TSDS must fit the [accepted time range](/manage-data/data-store/data-streams/time-bound-tsds.md#tsds-accepted-time-range).

Contributor

gmarouli Oct 1, 2025

Suggested change

      
            To test the following `_bulk` example, update the timestamps to within three hours of your current time. Data added to a TSDS must fit the [accepted time range](/manage-data/data-store/data-streams/time-bound-tsds.md#tsds-accepted-time-range).
          
            To test the following `_bulk` example, update the timestamps to within two hours of your current time. Data added to a TSDS must fit the [accepted time range](/manage-data/data-store/data-streams/time-bound-tsds.md#tsds-accepted-time-range).
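One way to avoid stale timestamps when testing is to generate the `_bulk` body at run time. The Python sketch below assumes hypothetical fields (`sensor_id`, `temperature`) and spaces the documents one minute apart, counting back from the current time. `build_bulk_body` is an illustrative helper, not part of any client library.

```python
import json
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def build_bulk_body(readings: List[Dict], now: Optional[datetime] = None) -> str:
    """Build an NDJSON _bulk body with timestamps close to the current time."""
    now = now or datetime.now(timezone.utc)
    lines = []
    for offset, reading in enumerate(readings):
        # Each document is one minute older than the previous one.
        timestamp = now - timedelta(minutes=offset)
        doc = {"@timestamp": timestamp.strftime("%Y-%m-%dT%H:%M:%S.000Z"), **reading}
        lines.append(json.dumps({"create": {}}))  # data streams accept create actions
        lines.append(json.dumps(doc))
    # The bulk API expects a trailing newline after the last line.
    return "\n".join(lines) + "\n"


body = build_bulk_body(
    [
        {"sensor_id": "sensor-1", "temperature": 20.5},
        {"sensor_id": "sensor-1", "temperature": 20.7},
    ]
)
print(body)
```

The printed body can be pasted after a `PUT <data-stream>/_bulk` request line, so the timestamps stay well within the accepted range.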

gmarouli reviewed

View reviewed changes

manage-data/data-store/data-streams/time-bound-tsds.md

    
              Only data that falls within this range is indexed.

              To check the accepted time range for writing to a TSDS, use the [get data stream API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-get-data-stream):

Contributor

gmarouli Oct 1, 2025

We are missing a nuance here. This API responds with the time range supported by a TSDB, but writes are not necessarily accepted within this time range. For example, if a backing index is marked read-only, they will be rejected.

Not sure how to rephrase this. A write in that range could potentially be accepted, but it's not a given.
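The nuance can be expressed as a small Python sketch. It assumes a half-open range (start inclusive, end exclusive) and a hypothetical `backing_index_read_only` flag. Neither function corresponds to an actual API.

```python
from datetime import datetime, timezone


def in_time_range(timestamp: datetime, start: datetime, end: datetime) -> bool:
    """Necessary condition: the timestamp falls inside the reported range."""
    return start <= timestamp < end


def write_may_succeed(
    timestamp: datetime, start: datetime, end: datetime, backing_index_read_only: bool
) -> bool:
    """Being in range is not sufficient if the backing index rejects writes."""
    return in_time_range(timestamp, start, end) and not backing_index_read_only


start = datetime(2025, 10, 1, 10, 0, tzinfo=timezone.utc)
end = datetime(2025, 10, 1, 14, 0, tzinfo=timezone.utc)
reading_time = datetime(2025, 10, 1, 12, 0, tzinfo=timezone.utc)

print(write_may_succeed(reading_time, start, end, backing_index_read_only=False))
print(write_may_succeed(reading_time, start, end, backing_index_read_only=True))
```

In other words, the reported range is an upper bound on what can be written, not a guarantee.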

gmarouli reviewed

View reviewed changes

manage-data/data-store/data-streams/time-bound-tsds.md

    
              ```

              ::::{tip}

              These {{ilm-init}} actions mark the source index as read-only or prevent writes for performance reasons:

Contributor

gmarouli Oct 1, 2025

I think we should rephrase this because some actions do not fit this explanation. Maybe something along these lines:

The following actions influence the writable time range of a TSDS, either because they make a backing index read-only or remove it:

gmarouli reviewed

View reviewed changes

manage-data/data-store/data-streams/time-bound-tsds.md

    
               - [Force merge](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-forcemerge.md) 

               - [Read only](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-readonly.md)

               - [Searchable snapshot](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-searchable-snapshot.md) 

               - [Shrink](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-shrink.md)

Contributor

gmarouli Oct 1, 2025

This action could revert the read-only status at the end of the action. Not sure if this is too much information here, but I thought to share it.

kkrik-es reviewed

View reviewed changes

manage-data/data-store/data-streams/time-bound-tsds.md

    
              ::::{tip}

              These {{ilm-init}} actions mark the source index as read-only or prevent writes for performance reasons:

               - [Delete](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-delete.md) 

               - [Downsample](elasticsearch://reference/elasticsearch/index-lifecycle-actions/ilm-downsample.md)

Contributor

kkrik-es Oct 1, 2025

Nit: Move Downsample to the top, it's the most relevant here.

kkrik-es reviewed

View reviewed changes

Contributor

kkrik-es left a comment

This is super nice.

kkrik-es removed the request for review from felixbarny

October 1, 2025 17:18

Labels

None yet