Problem
No runtime metrics: we can’t observe latency spikes or RPC error rates.
Background
prom-client is the de‑facto Prometheus client for Node.js; Grafana Cloud ships an out‑of‑the‑box Node.js dashboard JSON.
Proposed Solution
- Add
/metrics route exposing default prom-client registry + custom histograms:
- http_request_duration_seconds
- rpc_errors_total
- Docker‑Compose update: Prometheus + Grafana services.
- Import Grafana dashboard ID 11159; include JSON in
ops/.
Acceptance Criteria
curl /metrics returns ≥ 10 series.
- Grafana panel shows <200 ms p95 latency under load.
- CI job fails if
/metrics route breaks.
Labels
chore observability