
[🐛 Bug]: Health check making nodes unavailable in Kubernetes, getting error - "Could not start a new session. Response code 500. Message" #2812


Open
ketanb02 opened this issue Apr 30, 2025 · 4 comments

Comments

@ketanb02

What happened?

This behaviour is only observed when running on Kubernetes (using deployment.yaml); it is not observed when running the same images on regular Linux servers.

We have observed that, from v4.29 onwards, the Selenium Grid health check marks nodes as unavailable, which causes the following errors while running tests:
"org.openqa.selenium.SessionNotCreatedException: Could not start a new session. Response code 500. Message: Could not start a new session. java.net.ConnectException" and
"java.lang.NullPointerException: Cannot invoke "org.openqa.selenium.TakesScreenshot.getScreenshotAs(org.openqa.selenium.OutputType)" because "driver" is null"

Command used to start Selenium Grid with Docker (or Kubernetes)

kubectl apply -f hub.yaml

Relevant log output

14:49:45.487 INFO [LocalDistributor.add] - Added node b68b3b27-0e5c-40c4-82af-59dd1ecc4212 at https://node-chrome:5151. Health check every 120s
14:50:17.284 INFO [GridModel.setAvailability] - Switching Node b70e99a7-5d2c-45ec-b261-9f1d87d79744 (uri: https://node-chrome:5454) from UP to DOWN
14:52:17.265 INFO [GridModel.setAvailability] - Switching Node 7570b122-08d9-43f4-aff8-3276defc251d (uri: https://node-chrome:5252) from UP to DOWN

Operating System

Linux

Docker Selenium version (image tag)

4.31.0-20250414

Selenium Grid chart version (chart version)

No response


@ketanb02, thank you for creating this issue. We will troubleshoot it as soon as we can.



@VietND96
Member

VietND96 commented May 8, 2025

Hi, I think you should set SE_LOG_LEVEL=FINE on both the Hub and the Node to see a few more debug logs behind the health checks.

@ketanb02
Author

ketanb02 commented May 8, 2025

Hello, no logs related to health checks were generated even after adding --log-level FINE. Below is a UI console screenshot, just for reference, showing how the nodes get grayed out intermittently.

Image

Logs with FINE at the hub end are below. Nothing was logged at the node end.

10:59:58.971 DEBUG [HttpTracing.inject] - Injecting (GET) /status into org.openqa.selenium.remote.tracing.empty.NullContext@31a66910 at org.openqa.selenium.grid.node.remote.RemoteNode:252
10:59:58.972 DEBUG [JdkHttpClient.execute0] - Executing request: (GET) /status
10:59:58.980 DEBUG [JdkHttpClient.execute0] - Ending request (GET) /status in 7ms
10:59:58.980 DEBUG [LocalDistributor.updateNodeAvailability] - Health check result for https://node-chrome:5252 was DOWN
10:59:58.980 INFO [GridModel.setAvailability] - Switching Node 7570b122-08d9-43f4-aff8-3276defc251d (uri: https://node-chrome:5252) from UP to DOWN

This happens only on the Kubernetes setup and only from 4.29 onwards; up to 4.28.1 it was working fine.
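
For anyone trying to line these flips up with failing tests, here is a minimal Python sketch that pulls the availability transitions out of hub log text. It is not part of Selenium: the names `Transition` and `find_transitions` are hypothetical, and it only assumes the `GridModel.setAvailability` line format shown in the logs above.

```python
import re
from typing import List, NamedTuple


class Transition(NamedTuple):
    """One availability change reported by the hub (hypothetical helper type)."""
    time: str
    node_id: str
    uri: str
    old: str
    new: str


# Assumes the hub log line format seen in this issue, e.g.
# "10:59:58.980 INFO [GridModel.setAvailability] - Switching Node <id> (uri: <uri>) from UP to DOWN"
_PATTERN = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) INFO \[GridModel\.setAvailability\] - "
    r"Switching Node (?P<node_id>\S+) \(uri: (?P<uri>[^)]+)\) "
    r"from (?P<old>\w+) to (?P<new>\w+)"
)


def find_transitions(log_text: str) -> List[Transition]:
    """Return every availability transition found in the given hub log text."""
    transitions = []
    for line in log_text.splitlines():
        match = _PATTERN.match(line.strip())
        if match:
            transitions.append(Transition(**match.groupdict()))
    return transitions


if __name__ == "__main__":
    # Two lines copied from the hub log in this comment.
    sample = (
        "10:59:58.980 DEBUG [LocalDistributor.updateNodeAvailability] - "
        "Health check result for https://node-chrome:5252 was DOWN\n"
        "10:59:58.980 INFO [GridModel.setAvailability] - Switching Node "
        "7570b122-08d9-43f4-aff8-3276defc251d (uri: https://node-chrome:5252) "
        "from UP to DOWN\n"
    )
    for t in find_transitions(sample):
        print(f"{t.time} {t.uri}: {t.old} -> {t.new}")
```

Feeding it the full hub log gives a per-node timeline of UP/DOWN changes that can be compared against the timestamps of the SessionNotCreatedException failures.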

@ketanb02
Author

ketanb02 commented May 8, 2025

Just added a comparison screenshot of v4.28 and the later versions where we are facing the issue.

Image
