This directory contains a comprehensive test environment for reproducing and analyzing EHOSTUNREACH errors, with scenarios ranging from simple network-level tests to realistic Kubernetes service-to-service failures.
# Run all tests
cd /Users/jonball/dev/prime-microservices/packages/trade-falcon/test/network-errors
./run.sh
# Show options
./run.sh --help
These tests demonstrate fundamental EHOSTUNREACH errors using iptables:
basic-host-unreachable
- Direct connection to IP with host-unreachable rulebasic-net-unreachable
- Direct connection to IP with no route to network
These tests simulate realistic Kubernetes service failures:
k8s-stale-endpoint
- NLB forwarding to terminated pod IP (stale endpoint)k8s-pod-terminating
- Pod received SIGTERM and is shutting down TCP stackk8s-network-policy
- Network policy blocking traffic between namespaces
# Run only basic tests
./run.sh --basic
# Run only Kubernetes scenarios
./run.sh --k8s
# Run a specific scenario
./run.sh --scenario=k8s-stale-endpoint
# Keep container running for debugging
./run.sh --keep-alive
Error Code | Meaning | Linux Error Number | Description |
---|---|---|---|
EHOSTUNREACH | Host Unreachable | 113 | Route to network exists but host is unreachable |
ENETUNREACH | Network Unreachable | 101 | No route exists to the specified network |
In Kubernetes with NLB, EHOSTUNREACH often occurs due to:
-
Endpoint Staleness: Network Load Balancer has outdated endpoints that no longer exist
- The NLB forwards traffic to pod IPs that have been terminated
- This happens during rolling updates or pod evictions
-
Network Policy Issues: Network policies blocking traffic between namespaces
- Connection attempts fail at IP routing level with EHOSTUNREACH
- Often intermittent if policies are applied during traffic
-
Node Networking Issues: When a node has network configuration problems
- Occurs when a node can't route traffic to pods on other nodes
- May happen during node draining/eviction
-
Service Mesh Sidecar Problems: When a service mesh proxy fails
- Istio/Envoy sidecars can cause EHOSTUNREACH when misconfigured
- Traffic routing rules may create unreachable routes
To handle EHOSTUNREACH errors in your application:
-
Retries with Exponential Backoff: Implement retry logic for idempotent operations
async function fetchWithRetry(url, maxRetries = 3, initialDelay = 300) { let retries = 0; while (retries <= maxRetries) { try { return await fetch(url); } catch (err) { if (err.code === 'EHOSTUNREACH' && retries < maxRetries) { await new Promise(r => setTimeout(r, initialDelay * Math.pow(2, retries))); retries++; continue; } throw err; } } }
-
Circuit Breaking: Temporarily stop sending requests to failing endpoints
-
Client-Side Load Balancing: Implement client-side alternatives when load balancers fail
-
Health Checking: Pro-active endpoint health validation before sending requests