[Solved] Troubleshooting Kubernetes Application Istio Service Mesh Upstream Connect Error


Problem

A microservice application running in Kubernetes with Istio service mesh encountered a connectivity issue between services. Specifically:

  • Requests to existing resources (/resources/1) failed
  • Requests to non-existing resources (/resources/133) worked correctly
  • Error message: “upstream connect error or disconnect/reset before headers. reset reason: connection termination”
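
To reproduce the symptom and tell it apart from an ordinary application error, a small probe can request both paths and flag responses that carry Envoy's error text. The following is a minimal sketch, not the tooling used in this investigation: the class name UpstreamErrorProbe, its method names, and the base URL passed on the command line are all hypothetical, and it assumes the error text arrives in the response body.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class UpstreamErrorProbe {
    // Leading part of the error message quoted in the problem description.
    static final String ENVOY_MARKER =
            "upstream connect error or disconnect/reset before headers";

    // True when a response body contains the proxy's error text.
    static boolean isEnvoyUpstreamError(String body) {
        return body != null && body.contains(ENVOY_MARKER);
    }

    // Sends one GET request and summarizes the result on a single line.
    static String probe(HttpClient client, String url)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url)).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        String verdict = isEnvoyUpstreamError(response.body())
                ? "proxy-level upstream error"
                : "response produced by the application";
        return url + " -> HTTP " + response.statusCode() + " (" + verdict + ")";
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            // No base URL given: only classify the message from the problem description.
            String sample = ENVOY_MARKER + ". reset reason: connection termination";
            System.out.println(isEnvoyUpstreamError(sample)); // prints true
            return;
        }
        HttpClient client = HttpClient.newHttpClient();
        // Paths taken from the problem description; args[0] is the base URL (assumption).
        for (String path : new String[] {"/resources/1", "/resources/133"}) {
            System.out.println(probe(client, args[0] + path));
        }
    }
}
```

Run it with the gateway's base URL as the first argument. Without arguments it only checks the sample message and makes no network calls.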

[Figure: Istio architecture]

Solution

After thorough troubleshooting, the root cause was identified in the gateway application’s code:

  1. The gateway application was directly returning the ResponseEntity from an external API call without modification:
@GetMapping("/some/path")
public ResponseEntity<Response> getResponse(@RequestParam(name = "id", required = false) Integer id) {
    return externalApiClient.get(id); // Directly returning ResponseEntity from Feign client
}
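  2. Because that ResponseEntity carries the external API's status and headers along with the body, the upstream headers were passed on to the caller unchanged. This corrupted the response headers expected by Istio-proxy (Envoy).
  3. The fix was to take only the body of the external response and wrap it in a newly created ResponseEntity: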
@GetMapping("/some/path")
public ResponseEntity<Response> getResponse(@RequestParam(name = "id", required = false) Integer id) {
    return ResponseEntity.ok(externalApiClient.get(id).getBody()); // Creating a fresh ResponseEntity
}
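
If some headers from the external API do need to reach the caller, one option is to copy them selectively into the fresh response rather than forwarding everything. The sketch below is an illustration under assumptions, not the fix applied above: RelayHeaders and forwardable are hypothetical names, the filter removes the standard HTTP/1.1 hop-by-hop headers, and dropping Content-Length is an extra assumption so that the server recomputes it for the re-serialized body. Which header actually triggered the reset is not stated here.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class RelayHeaders {
    // Hop-by-hop headers describe a single connection and should not be relayed
    // by an intermediary. Content-Length is included as an assumption: the
    // gateway re-serializes the body, so its length may differ.
    private static final Set<String> NOT_FORWARDED = Set.of(
            "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
            "te", "trailer", "transfer-encoding", "upgrade", "content-length");

    // Returns a copy of the upstream headers without the entries listed above.
    // Header names are compared case-insensitively.
    static Map<String, List<String>> forwardable(Map<String, List<String>> upstream) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : upstream.entrySet()) {
            String name = entry.getKey();
            if (name != null && !NOT_FORWARDED.contains(name.toLowerCase(Locale.ROOT))) {
                result.put(name, List.copyOf(entry.getValue()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> upstream = new LinkedHashMap<>();
        upstream.put("Content-Type", List.of("application/json"));
        upstream.put("Transfer-Encoding", List.of("chunked"));
        upstream.put("Connection", List.of("keep-alive"));
        System.out.println(forwardable(upstream)); // prints {Content-Type=[application/json]}
    }
}
```

In a Spring controller, the filtered map could be copied into the headers of the new ResponseEntity before the body is set. The approach shown above, which forwards no upstream headers at all, remains the simpler choice.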

Key points:

  • Directly proxying external request responses can cause issues with service mesh components
  • Returning a fresh ResponseEntity built from only the body of the external API response resolved the header corruption
  • Thorough network sniffing and code inspection were crucial in identifying the problem

Best Practices

  • Do not return the response object of an external request directly to the caller
  • Wrap the body of an external API response in a response object created by your own service
  • Ensure observability in microservice applications, since network captures and code inspection were what exposed this problem
  • Be aware that similar issues can occur in languages other than Java

This solution resolved the upstream connect error and ensured proper communication between services in the Istio service mesh environment.
