spantaleev
Contributor

In pgStatWalReceiverQueryTemplate, the order of the columns (when hasFlushedLSN == true) is:

  • ...
  • receive_start_lsn
  • flushed_lsn
  • receive_start_tli
  • ...

However, columns were scanned in this order:

  • ...
  • receive_start_lsn -> receiveStartLsn
  • receive_start_tli -> flushedLsn (!)
  • flushed_lsn -> receiveStartTli (!)
  • ...

This incorrect hydration of variables also manifests as swapped values for the pg_stat_wal_receiver_flushed_lsn and pg_stat_wal_receiver_receive_start_tli metrics.
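To illustrate why the mismatch goes unnoticed, here is a minimal Go sketch of positional scanning. It is not the exporter's actual code: `scanPositional` is a hypothetical stand-in for `rows.Scan()`, the `walReceiverRow` struct and the sample values are made up, and all three columns are assumed to be plain `int64` for simplicity. Because destinations are filled purely by position and share a compatible type, listing them in the wrong order produces no error, only swapped values.

```go
package main

import "fmt"

// walReceiverRow holds the three values discussed above. The field names
// mirror the variables in the description; the int64 types are an assumption
// made to keep this sketch small.
type walReceiverRow struct {
	receiveStartLsn int64
	flushedLsn      int64
	receiveStartTli int64
}

// scanPositional is a hypothetical stand-in for rows.Scan(): it copies column
// values into the destinations strictly by position and never looks at names.
func scanPositional(values []int64, dest ...*int64) error {
	if len(values) != len(dest) {
		return fmt.Errorf("expected %d destinations, got %d", len(values), len(dest))
	}
	for i, v := range values {
		*dest[i] = v
	}
	return nil
}

// sampleRow lists made-up values in the order the query selects them:
// receive_start_lsn, flushed_lsn, receive_start_tli.
var sampleRow = []int64{100, 200, 1}

// scanBuggy passes the destinations in the mismatched order described above.
func scanBuggy(values []int64) (walReceiverRow, error) {
	var r walReceiverRow
	err := scanPositional(values, &r.receiveStartLsn, &r.receiveStartTli, &r.flushedLsn)
	return r, err
}

// scanFixed passes the destinations in the same order as the selected columns.
func scanFixed(values []int64) (walReceiverRow, error) {
	var r walReceiverRow
	err := scanPositional(values, &r.receiveStartLsn, &r.flushedLsn, &r.receiveStartTli)
	return r, err
}

func main() {
	buggy, err := scanBuggy(sampleRow)
	if err != nil {
		panic(err)
	}
	fixed, err := scanFixed(sampleRow)
	if err != nil {
		panic(err)
	}
	fmt.Printf("buggy: %+v\n", buggy)
	fmt.Printf("fixed: %+v\n", fixed)
}
```

With the sample row, the buggy variant ends up with `flushedLsn` set to 1 and `receiveStartTli` set to 200, while the fixed variant keeps each value in the matching field.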

This seems to be a bug that has existed since the initial implementation:

  • 2d7e152
  • prometheus-community#844

In this patch, I'm:

  • fixing the .Scan(), so that it hydrates variables in the correct order

  • adjusting the order in which metrics are pushed out to the channel, to follow the order we consume them in (.., receive_start_lsn, flushed_lsn, receive_start_tli, ..)

  • adjusting the walreceiver tests, to follow the new order (which matches .Scan())

  • fixing a small indentation issue in pgStatWalReceiverQueryTemplate

@cristiangreco
Contributor

Hi @spantaleev! Do you mind fixing the DCO check please?

…lector

In `pgStatWalReceiverQueryTemplate`, the order of the columns (when `hasFlushedLSN == true`) is:

- ...
- `receive_start_lsn`
- `flushed_lsn`
- `receive_start_tli`
- ...

However, columns were scanned in this order:

- ...
- `receive_start_lsn` -> `receiveStartLsn`
- `receive_start_tli` -> `flushedLsn` (!)
- `flushed_lsn` -> `receiveStartTli` (!)
- ...

This incorrect hydration of variables also manifests as swapped values for the
`pg_stat_wal_receiver_flushed_lsn` and `pg_stat_wal_receiver_receive_start_tli` metrics.

This seems to be a bug that has existed since the initial implementation:

- 2d7e152
- prometheus-community#844

In this patch, I'm:

- fixing the `.Scan()`, so that it hydrates variables in the correct order

- adjusting the order in which metrics are pushed out to the channel,
  to follow the order we consume them in
  (.., `receive_start_lsn`, `flushed_lsn`, `receive_start_tli`, ..)

- adjusting the walreceiver tests, to follow the new order (which matches `.Scan()`)

- fixing a small indentation issue in `pgStatWalReceiverQueryTemplate`

Signed-off-by: Slavi Pantaleev <slavi@devture.com>
spantaleev force-pushed the fix-walreceiver-swapped-values branch from b1cc9bc to 024c1fd on September 29, 2025 11:07
@spantaleev
Contributor Author

I've fixed the sign-off. Sorry for missing that the first time around!

cristiangreco merged commit ef2736e into prometheus-community:master on Sep 29, 2025
11 checks passed
cristiangreco added a commit that referenced this pull request Sep 29, 2025
* [BUGFIX] Fix swapped `flushedLsn` and `receiveStartTli` for `wal_receiver` collector by @spantaleev in #1198
* [BUGFIX] Fix superfluous semicolon breaking query in `process_idle` by @sysadmind in #1197 and #1201
cristiangreco added a commit that referenced this pull request Sep 29, 2025
* [BUGFIX] Fix swapped `flushedLsn` and `receiveStartTli` for `wal_receiver` collector by @spantaleev in #1198
* [BUGFIX] Fix superfluous semicolon breaking query in `process_idle` by @sysadmind in #1197 and #1201

Signed-off-by: Cristian Greco <cristian@regolo.cc>
sysadmind pushed a commit that referenced this pull request Sep 29, 2025
* [BUGFIX] Fix swapped `flushedLsn` and `receiveStartTli` for `wal_receiver` collector by @spantaleev in #1198
* [BUGFIX] Fix superfluous semicolon breaking query in `process_idle` by @sysadmind in #1197 and #1201

Signed-off-by: Cristian Greco <cristian@regolo.cc>