Connection refused error when calling service endpoint #273

@WJay-tec

Problem

Calling a ClusterSet service endpoint after deleting a pod that backs the service results in a connection refused error.

Steps to reproduce the connection refused error

  1. Have pods running in 2 clusters (for example stag-eks and stag-eks-2)
  2. Create a ServiceExport in stag-eks for the service you want to expose
  3. Step 2 automatically creates a ServiceImport on both clusters
  4. Create a dummy pod in stag-eks-2 and exec into it. Run a curl command against the ClusterSet endpoint exported in step 2 (the curl command gets a successful response)
  5. In stag-eks, delete a pod that backs the service exported in step 2
  6. Wait for the pod to be recreated, then run the curl command again (this time it fails with a connection refused error)
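
For anyone reproducing this without curl, the check in steps 4 and 6 can be sketched in Python. This is only a rough sketch: `my-service`, `default`, and port 80 are placeholder values that are not from this report, and the hostname format assumes the standard `clusterset.local` zone from the Multi-Cluster Services API.

```python
import socket


def clusterset_hostname(service: str, namespace: str) -> str:
    """Build the ClusterSet DNS name (assumes the standard clusterset.local zone)."""
    return f"{service}.{namespace}.svc.clusterset.local"


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt one TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except ConnectionRefusedError:
        # The address resolved, but nothing is listening there.
        return "refused"
    except socket.gaierror:
        # The name could not be resolved at all.
        return "dns_error"
    except OSError:
        # Timeouts and other network errors.
        return "unreachable"


if __name__ == "__main__":
    # Placeholder service name, namespace, and port; replace with real values.
    host = clusterset_hostname("my-service", "default")
    print(host, probe(host, 80))
```

Run from the dummy pod in stag-eks-2, a result of `refused` after step 5 would match the behaviour described here, while `dns_error` would point to a different problem.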

Steps to resolve the issue

  1. Delete the ServiceImport in stag-eks-2 (the cluster the caller runs in)
  2. Rerun the curl command in the dummy pod in stag-eks-2; it now gets a successful response
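
The workaround in step 1 could be scripted roughly as follows. This is a sketch under assumptions: `my-service` and `default` are placeholder names, and it assumes the ServiceImport CRD is installed so that kubectl accepts `serviceimport` as a resource name.

```python
import subprocess
from typing import List


def delete_serviceimport_cmd(name: str, namespace: str, context: str) -> List[str]:
    """Build the kubectl command that deletes one ServiceImport in one cluster."""
    return [
        "kubectl",
        "--context", context,
        "--namespace", namespace,
        "delete", "serviceimport", name,
    ]


if __name__ == "__main__":
    # Placeholder service name and namespace; the context is the caller's cluster.
    cmd = delete_serviceimport_cmd("my-service", "default", "stag-eks-2")
    subprocess.run(cmd, check=True)
```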

Based on my observations so far, CoreDNS does not seem to pick up the new pod IP and still resolves to the old pod IP.
Once the ServiceImport is recreated, it works again, probably because the recreation updates the CoreDNS record.
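
One way to confirm the stale-record suspicion is to compare what the ClusterSet name resolves to against the current pod IPs. The sketch below assumes the name resolves directly to pod IPs, as the observation above suggests. The hostname and IPs in the example are placeholders, and the current pod IPs would have to be collected separately (for example from `kubectl get pods -o wide` in stag-eks).

```python
import socket
from typing import Set


def resolve_ipv4(hostname: str) -> Set[str]:
    """Return the set of IPv4 addresses the resolver gives for hostname."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def stale_addresses(resolved: Set[str], current_pod_ips: Set[str]) -> Set[str]:
    """Addresses still served by DNS that no current pod owns."""
    return resolved - current_pod_ips


if __name__ == "__main__":
    # Placeholder hostname and pod IP; replace with real values.
    resolved = resolve_ipv4("my-service.default.svc.clusterset.local")
    print(stale_addresses(resolved, {"10.0.0.2"}))
```

A non-empty result after the pod has been recreated would support the idea that the DNS record is not being refreshed.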

It is also worth adding that removing the readinessProbe from the deployment manifest fixes the issue described above (I don't really understand why that fixes it).
