
HTTP server randomly closes, offers vague reason, refuses to elaborate further #1000


Open
grepwood opened this issue Mar 6, 2025 · 2 comments
Labels: backlog candidate (Pull requests/issues that are candidates to be backlog items), community (Issues or PRs opened by an external contributor)

Comments

grepwood commented Mar 6, 2025

Describe the bug
I'm running nginx-prometheus-exporter (NPE) as a container sitting next to my nginx container. The NPE container randomly dies for no apparent reason; it just logs:

{"time": actual time,"level":"INFO","source":"exporter.go:217","msg":"shutting down"}
{"time": actual time,"level":"INFO","source":"exporter.go:208","msg":"HTTP server closed","error":"http: server closed"}

Despite NPE being started with --log.level=debug, there is no further detail about why the HTTP server shut down.
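
For reference, here is a minimal sketch of the usual graceful-shutdown pattern in Go HTTP services (illustrative only, not the actual exporter.go source). A handler like this prints exactly this "shutting down" / "HTTP server closed" pair when the process receives SIGTERM or SIGINT, and nothing in the pattern records who sent the signal or why; so if the exporter follows a similar pattern, a higher log level would not reveal the cause.

```go
package main

import (
	"context"
	"errors"
	"log/slog"
	"net/http"
	"os/signal"
	"syscall"
	"time"
)

func main() {
	srv := &http.Server{Addr: ":9113"}

	// ctx is cancelled when SIGTERM or SIGINT arrives
	// (e.g. the container runtime stopping the container).
	ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGTERM, syscall.SIGINT)
	defer stop()

	go func() {
		// ListenAndServe returns http.ErrServerClosed ("http: Server closed")
		// once Shutdown is called.
		err := srv.ListenAndServe()
		if err != nil && !errors.Is(err, http.ErrServerClosed) {
			slog.Error("HTTP server error", "error", err)
			return
		}
		slog.Info("HTTP server closed", "error", err)
	}()

	<-ctx.Done() // block here until a termination signal is received
	slog.Info("shutting down")

	// Give in-flight scrapes a few seconds to finish, then exit.
	shutdownCtx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	_ = srv.Shutdown(shutdownCtx)
}
```

In a Docker/Kubernetes setup, the most common sender of that SIGTERM is the orchestrator stopping the container (for example after a failed liveness probe or an eviction), so the actual reason tends to show up in the orchestrator's events rather than in the exporter's own logs.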

To reproduce
Steps to reproduce the behavior:

  1. Deploy NPE with --log.level=debug.
  2. Wait 4 minutes, or 4 hours, or 12 hours.

Expected behavior
NPE should explain why it shut down so that I can actually fix it.

Your environment

  • Version of the Prometheus exporter - 1.4.1
  • Version of Docker/Kubernetes - not relevant
  • [if applicable] Kubernetes platform (e.g. Minikube or GCP): Mirantis
  • Using NGINX or NGINX Plus: NGINX

nginx-bot (bot) commented Mar 6, 2025

Hi @grepwood! Welcome to the project! 🎉

Thanks for opening this issue!
Be sure to check out our Contributing Guidelines and the Issue Lifecycle while you wait for someone on the team to take a look at this.

@nginx-bot added the community (Issues or PRs opened by an external contributor) label on Mar 6, 2025
@vepatel added the backlog candidate (Pull requests/issues that are candidates to be backlog items) label on Mar 17, 2025
@diogokiss commented

I've been facing the exact same issue for a couple of weeks now. I'm also running it in Kubernetes with 3 pods, in a production environment. Here is the data I managed to collect.

By logging in to the Nginx container's shell, I observed the following: even though the Nginx /stub_status endpoint responds normally with the metrics, I can't get a reply from the Nginx Prometheus Exporter container. The request just hangs indefinitely.

(I interrupted the execution after more than 1 min.)

/ $ date && time curl -v 127.0.0.1:8000/stub_status && echo && echo && echo && date && time curl -v 127.0.0.1:9113/metrics; date
Thu May  8 21:03:24 UTC 2025

*   Trying 127.0.0.1:8000...
* Connected to 127.0.0.1 (127.0.0.1) port 8000
* using HTTP/1.x
> GET /stub_status HTTP/1.1
> Host: 127.0.0.1:8000
> User-Agent: curl/8.12.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 200 OK
< Server: nginx/1.26.3
< Date: Thu, 08 May 2025 21:03:24 GMT
< Content-Type: text/plain
< Content-Length: 107
< Connection: keep-alive
<
Active connections: 36
server accepts handled requests
 313 313 14525
Reading: 0 Writing: 6 Waiting: 30
* Connection #0 to host 127.0.0.1 left intact
real	0m 0.00s
user	0m 0.00s
sys	0m 0.00s



Thu May  8 21:03:24 UTC 2025

*   Trying 127.0.0.1:9113...
* Connected to 127.0.0.1 (127.0.0.1) port 9113
* using HTTP/1.x
> GET /metrics HTTP/1.1
> Host: 127.0.0.1:9113
> User-Agent: curl/8.12.1
> Accept: */*
>
* Request completely sent off
^C Command terminated by signal 2
real	1m 24.84s
user	0m 0.00s
sys	0m 0.00s

Thu May  8 21:04:49 UTC 2025

Logs from the containers

1746735622753	time=2025-05-08T20:20:22.753Z level=INFO source=exporter.go:217 msg="shutting down"
1746735622753	time=2025-05-08T20:20:22.753Z level=INFO source=exporter.go:208 msg="HTTP server closed" error="http: Server closed"
1746735622872	time=2025-05-08T20:20:22.872Z level=INFO source=exporter.go:123 msg=nginx-prometheus-exporter version="(version=1.4.2, branch=HEAD, revision=ced6fda825f88077debfacab8d82536ce502bb17)"
1746735622872	time=2025-05-08T20:20:22.872Z level=INFO source=exporter.go:124 msg="build context" build_context="(go=go1.24.2, platform=linux/amd64, user=goreleaser, date=2025-04-28T15:24:56Z, tags=unknown)"
1746735622875	time=2025-05-08T20:20:22.875Z level=INFO source=tls_config.go:347 msg="Listening on" address=[::]:9113
1746735622875	time=2025-05-08T20:20:22.875Z level=INFO source=tls_config.go:350 msg="TLS is disabled." http2=false address=[::]:9113
1746735828041	time=2025-05-08T20:23:48.041Z level=INFO source=exporter.go:217 msg="shutting down"
1746735828041	time=2025-05-08T20:23:48.041Z level=INFO source=exporter.go:208 msg="HTTP server closed" error="http: Server closed"
1746735828155	time=2025-05-08T20:23:48.155Z level=INFO source=exporter.go:123 msg=nginx-prometheus-exporter version="(version=1.4.2, branch=HEAD, revision=ced6fda825f88077debfacab8d82536ce502bb17)"
1746735828155	time=2025-05-08T20:23:48.155Z level=INFO source=exporter.go:124 msg="build context" build_context="(go=go1.24.2, platform=linux/amd64, user=goreleaser, date=2025-04-28T15:24:56Z, tags=unknown)"
1746735828158	time=2025-05-08T20:23:48.158Z level=INFO source=tls_config.go:347 msg="Listening on" address=[::]:9113
1746735828158	time=2025-05-08T20:23:48.158Z level=INFO source=tls_config.go:350 msg="TLS is disabled." http2=false address=[::]:9113
1746735848029	time=2025-05-08T20:24:08.029Z level=INFO source=exporter.go:217 msg="shutting down"
1746735848029	time=2025-05-08T20:24:08.029Z level=INFO source=exporter.go:208 msg="HTTP server closed" error="http: Server closed"
1746735848194	time=2025-05-08T20:24:08.194Z level=INFO source=exporter.go:123 msg=nginx-prometheus-exporter version="(version=1.4.2, branch=HEAD, revision=ced6fda825f88077debfacab8d82536ce502bb17)"
1746735848194	time=2025-05-08T20:24:08.194Z level=INFO source=exporter.go:124 msg="build context" build_context="(go=go1.24.2, platform=linux/amd64, user=goreleaser, date=2025-04-28T15:24:56Z, tags=unknown)"
1746735848198	time=2025-05-08T20:24:08.197Z level=INFO source=tls_config.go:347 msg="Listening on" address=[::]:9113
1746735848198	time=2025-05-08T20:24:08.197Z level=INFO source=tls_config.go:350 msg="TLS is disabled." http2=false address=[::]:9113

Events

Events:
  Warning  Unhealthy                20m                kubelet                  Readiness probe failed: Get "http://240.48.0.181:9113/metrics": EOF
  Normal   Pulled                   20m (x2 over 22m)  kubelet                  Container image "nginx/nginx-prometheus-exporter:1.4.2" already present on machine
  Normal   Started                  20m (x2 over 22m)  kubelet                  Started container nginx-prometheus-exporter
  Normal   Killing                  20m                kubelet                  Container nginx-prometheus-exporter failed liveness probe, will be restarted
  Normal   Created                  20m (x2 over 22m)  kubelet                  Created container: nginx-prometheus-exporter
  Warning  Unhealthy                77s (x7 over 20m)  kubelet                  Liveness probe failed: Get "http://240.48.0.181:9113/metrics": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
  Warning  Unhealthy                67s (x8 over 20m)  kubelet                  Readiness probe failed: Get "http://240.48.0.181:9113/metrics": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

Events:
  Normal   Started                  19m (x2 over 22m)  kubelet                  Started container nginx-prometheus-exporter
  Normal   Created                  19m (x2 over 22m)  kubelet                  Created container: nginx-prometheus-exporter
  Normal   Pulled                   19m (x2 over 22m)  kubelet                  Container image "nginx/nginx-prometheus-exporter:1.4.2" already present on machine
  Normal   Killing                  19m                kubelet                  Container nginx-prometheus-exporter failed liveness probe, will be restarted
  Warning  Unhealthy                19m                kubelet                  Readiness probe failed: Get "http://240.48.0.73:9113/metrics": EOF
  Warning  Unhealthy                8s (x10 over 19m)  kubelet                  Readiness probe failed: Get "http://240.48.0.73:9113/metrics": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
  Warning  Unhealthy                8s (x10 over 19m)  kubelet                  Liveness probe failed: Get "http://240.48.0.73:9113/metrics": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

Events:
  Normal   Started                  20m (x2 over 22m)  kubelet                  Started container nginx-prometheus-exporter
  Normal   Created                  20m (x2 over 22m)  kubelet                  Created container: nginx-prometheus-exporter
  Normal   Pulled                   20m (x2 over 22m)  kubelet                  Container image "nginx/nginx-prometheus-exporter:1.4.2" already present on machine
  Normal   Killing                  20m                kubelet                  Container nginx-prometheus-exporter failed liveness probe, will be restarted
  Warning  Unhealthy                20m                kubelet                  Readiness probe failed: Get "http://240.48.0.60:9113/metrics": EOF
  Warning  Unhealthy                83s (x7 over 20m)  kubelet                  Readiness probe failed: Get "http://240.48.0.60:9113/metrics": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
  Warning  Unhealthy                78s (x8 over 20m)  kubelet                  Liveness probe failed: Get "http://240.48.0.60:9113/metrics": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

Configuration

args:
  - --nginx.scrape-uri=http://127.0.0.1:8000/stub_status
  - --log.level=debug
image: nginx/nginx-prometheus-exporter:1.4.2
imagePullPolicy: IfNotPresent
livenessProbe:
  failureThreshold: 3
  httpGet:
    path: /metrics
    port: 9113
    scheme: HTTP
  initialDelaySeconds: 10
  periodSeconds: 5
  successThreshold: 1
  timeoutSeconds: 3
name: nginx-prometheus-exporter
ports:
  - containerPort: 9113
    name: nginx-metrics
    protocol: TCP
readinessProbe:
  failureThreshold: 3
  httpGet:
    path: /metrics
    port: 9113
    scheme: HTTP
  initialDelaySeconds: 10
  periodSeconds: 5
  successThreshold: 1
  timeoutSeconds: 3
resources:
  limits:
    memory: 64Mi
  requests:
    cpu: 100m
    memory: 64Mi
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
  - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
    name: kube-api-access-4pnp9
    readOnly: true

In the meantime, we lose the metrics in Grafana.
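
For anyone trying to reproduce the probe failures outside the cluster, here is a small diagnostic sketch of my own (not from the exporter; it simply mirrors the probe settings above: a GET on /metrics with a 3-second timeout every 5 seconds, assuming the default :9113 listen address). When the exporter is in the hung state, it fails with the same "context deadline exceeded (Client.Timeout exceeded while awaiting headers)" error shown in the events.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"time"
)

func main() {
	// Same budget as the kubelet probes above: timeoutSeconds: 3, periodSeconds: 5.
	client := &http.Client{Timeout: 3 * time.Second}

	for {
		start := time.Now()
		resp, err := client.Get("http://127.0.0.1:9113/metrics")
		if err != nil {
			// In the hung state this prints the same error kubelet reports:
			// "context deadline exceeded (Client.Timeout exceeded while awaiting headers)"
			fmt.Printf("%s probe failed after %s: %v\n",
				time.Now().Format(time.RFC3339), time.Since(start).Round(time.Millisecond), err)
		} else {
			io.Copy(io.Discard, resp.Body) // drain the body so the connection can be reused
			resp.Body.Close()
			fmt.Printf("%s probe ok in %s\n",
				time.Now().Format(time.RFC3339), time.Since(start).Round(time.Millisecond))
		}
		time.Sleep(5 * time.Second)
	}
}
```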
