Previously approved, but merged into the wrong base:
#2
## Background
The current behavior of the exporter is to open a new database
connection on every scrape.
First, here is where the default metrics collector and new-style
collectors are registered:
https://github.com/planetscale/postgres_exporter/blob/198454cc9e56141d5cc422149755fe8e80b3eeea/cmd/postgres_exporter/main.go#L126
https://github.com/planetscale/postgres_exporter/blob/198454cc9e56141d5cc422149755fe8e80b3eeea/cmd/postgres_exporter/main.go#L143
Prometheus may run these collectors in parallel. Additionally, the
`collectors` package is set up to support concurrent scrapes:
https://github.com/planetscale/postgres_exporter/blob/198454cc9e56141d5cc422149755fe8e80b3eeea/collector/collector.go#L171-L176
No doubt it's useful for some deployments to have the default and
new-style metrics collected concurrently, and to be able to support
concurrent scrapes.
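To make the connection cost concrete, here is a minimal sketch (the
`dialingCollector` type is invented for illustration; only the
`prometheus` registry behavior and `database/sql` are real): when several
collectors each dial the database inside `Collect`, a single `Gather` can
run those `Collect` calls on separate goroutines, so one scrape may hold
several connections open at once.
```go
package main

import (
	"database/sql"

	_ "github.com/lib/pq"
	"github.com/prometheus/client_golang/prometheus"
)

// dialingCollector is a hypothetical collector that, like the exporter's
// current behavior, opens a fresh database connection on every scrape.
type dialingCollector struct {
	dsn string
	up  *prometheus.Desc
}

func newDialingCollector(name, dsn string) *dialingCollector {
	return &dialingCollector{
		dsn: dsn,
		up:  prometheus.NewDesc(name, "1 if the database answered", nil, nil),
	}
}

func (c *dialingCollector) Describe(ch chan<- *prometheus.Desc) { ch <- c.up }

func (c *dialingCollector) Collect(ch chan<- prometheus.Metric) {
	// sql.Open is lazy; the QueryRow below is what actually dials --
	// once per scrape, per collector.
	db, err := sql.Open("postgres", c.dsn)
	if err != nil {
		return
	}
	defer db.Close()

	var v float64
	if err := db.QueryRow("SELECT 1").Scan(&v); err != nil {
		return
	}
	ch <- prometheus.MustNewConstMetric(c.up, prometheus.GaugeValue, v)
}

func main() {
	dsn := "postgresql://postgres_exporter@127.0.0.1:5432/postgres?sslmode=disable"
	reg := prometheus.NewRegistry()
	// Two collectors on one registry: Gather may run both Collect
	// methods concurrently, so two connections can be open at once.
	reg.MustRegister(newDialingCollector("demo_default_up", dsn))
	reg.MustRegister(newDialingCollector("demo_collector_up", dsn))
	if _, err := reg.Gather(); err != nil {
		panic(err)
	}
}
```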
## Changes
But for PlanetScale, our metric collection system does not scrape the
same endpoint concurrently, so concurrent scrapes aren't useful for us.
We can also live without whatever time is gained by having default and
new-style metrics be collected concurrently: our scrape timeout is 10s,
and we expect metrics to be collected much faster than that. If not,
then we probably have other problems we need to look at.
We also want to consume as few customer connections as possible, so
having the option of a single connection shared between the default and
new-style metrics is good for us. Moreover, if customers have used up all
their connections, having each scrape open a new database connection
might mean we can't produce metrics at all. So having the option to use a
single, shared, _persistent_ connection is doubly useful.
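A minimal sketch of what that option looks like, assuming nothing about
this PR's internals beyond the standard library (`openSharedDB` is an
invented name): cap one persistent `database/sql` pool at a single
connection and hand the same handle to every collector.
```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/lib/pq"
)

// openSharedDB is a hypothetical helper: one pool, capped at a single
// persistent connection, shared by the default and new-style collectors.
func openSharedDB(dsn string) (*sql.DB, error) {
	db, err := sql.Open("postgres", dsn)
	if err != nil {
		return nil, err
	}
	db.SetMaxOpenConns(1)    // consume at most one customer connection slot
	db.SetMaxIdleConns(1)    // keep that connection alive between scrapes
	db.SetConnMaxLifetime(0) // 0 = reuse the connection indefinitely
	return db, nil
}

func main() {
	db, err := openSharedDB("postgresql://postgres_exporter@127.0.0.1:5432/postgres?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	// Both metric paths would query through this one *sql.DB; the pool
	// serializes their queries onto the single persistent connection.
	var v int
	if err := db.QueryRow("SELECT 1").Scan(&v); err != nil {
		log.Fatal(err)
	}
	fmt.Println("shared connection answered:", v)
}
```
Because `database/sql` generally retries on a stale pooled connection, a
server-side reset should cost at most one failed scrape before the pool
re-dials, which matches the resilience behavior validated below.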
## Validation
Create a role with a connection limit of 1:
```postgresql
CREATE ROLE postgres_exporter WITH LOGIN CONNECTION LIMIT 1;
GRANT pg_monitor TO postgres_exporter;
GRANT CONNECT ON DATABASE postgres TO postgres_exporter;
```
### With `--no-concurrent-scrape`
Start the exporter with concurrent scraping disabled:
```bash
DATA_SOURCE_NAME="postgresql://postgres_exporter@127.0.0.1:5432/postgres?sslmode=disable" ./postgres_exporter --no-concurrent-scrape --log.level=info
```
Scraping metrics shows no errors in output.
### With `--concurrent-scrape`
Repeating the scrape with `--concurrent-scrape` shows this error:
> time=2025-09-06T20:56:44.568-04:00 level=ERROR source=collector.go:195 msg="Error opening connection to database" err="error querying postgresql version: pq: too many connections for role \"postgres_exporter\""
### Connection resilience
Verified that `--no-concurrent-scrape` is resilient to Postgres
connection resets. After a connection reset, the exporter will log an
error:
> time=2025-09-06T22:44:06.172-04:00 level=ERROR source=collector.go:180 msg="Error creating instance" err="driver: bad connection"
But the connection will be recreated and the scrape (or the next scrape
anyway) will succeed. Behavior seems similar to `master`.
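One way to reproduce such a reset (an illustrative sketch; not
necessarily how it was validated here) is to terminate the exporter's
server-side backend from an admin session and then scrape twice; the
admin DSN below is an assumption for the example:
```go
package main

import (
	"database/sql"
	"fmt"
	"log"

	_ "github.com/lib/pq"
)

func main() {
	admin, err := sql.Open("postgres", "postgresql://postgres@127.0.0.1:5432/postgres?sslmode=disable")
	if err != nil {
		log.Fatal(err)
	}
	defer admin.Close()

	// Kill the exporter's backend, simulating a connection reset. The
	// next scrape should log "driver: bad connection" once, after which
	// the connection is recreated and scrapes succeed again.
	var terminated int
	err = admin.QueryRow(
		`SELECT count(pg_terminate_backend(pid))
		   FROM pg_stat_activity
		  WHERE usename = 'postgres_exporter'`).Scan(&terminated)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("terminated %d backend(s)\n", terminated)
}
```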
---------
Signed-off-by: Max Englander <max@planetscale.com>
Excerpt of the diff to `cmd/postgres_exporter/main.go`:
```diff
+		// New optimized behavior: share connection from server with resilience
+		factory = func() (*collector.Instance, error) {
+			server, err := exporter.servers.GetServer(dsn)
+			if err != nil {
+				return nil, err
+			}
+
+			inst, err := collector.NewInstance(dsn)
+			if err != nil {
+				return nil, err
+			}
+
+			err = inst.SetupWithConnection(server.db)
+			if err != nil {
+				return nil, err
+			}
+
+			return inst, nil
+		}
+	}
+
+	// Create collector with factory
+	pe, err := collector.NewPostgresCollector(
+		logger,
+		excludedDatabases,
+		factory,
+		[]string{},
+		collector.WithTimeout(scrapeTimeout),
+	)
+	if err != nil {
+		logger.Warn("Failed to create PostgresCollector", "err", err.Error())
+		return
+	}
+
+	prometheus.MustRegister(pe)
+}
+
 var (
 	c = config.Handler{
 		Config: &config.Config{},
@@ -50,6 +105,7 @@ var (
 	includeDatabases = kingpin.Flag("include-databases", "A list of databases to include when autoDiscoverDatabases is enabled (DEPRECATED)").Default("").Envar("PG_EXPORTER_INCLUDE_DATABASES").String()
 	metricPrefix     = kingpin.Flag("metric-prefix", "A metric prefix can be used to have non-default (not \"pg\") prefixes for each of the metrics").Default("pg").Envar("PG_EXPORTER_METRIC_PREFIX").String()
 	scrapeTimeout    = kingpin.Flag("scrape-timeout", "Maximum time for a scrape to complete before timing out (0 = no timeout)").Default("0").Envar("PG_EXPORTER_SCRAPE_TIMEOUT").Duration()
+	concurrentScrape = kingpin.Flag("concurrent-scrape", "Use dedicated instance for collector allowing concurrent scrapes (default: true for backward compatibility)").Default("true").Envar("PG_EXPORTER_CONCURRENT_SCRAPE").Bool()
 	logger           = promslog.NewNopLogger()
 )

@@ -133,18 +189,7 @@ func main() {
 		dsn = dsns[0]
 	}

-	pe, err := collector.NewPostgresCollector(
-		logger,
-		excludedDatabases,
-		dsn,
-		[]string{},
-		collector.WithTimeout(*scrapeTimeout),
-	)
-	if err != nil {
-		logger.Warn("Failed to create PostgresCollector", "err", err.Error())
```