Failover Strategy for Endpoints
The Backpac Load Balancer utilizes a comprehensive failover strategy to ensure that users experience minimal downtime and continue to receive a reliable service, even when some blockchain nodes are experiencing errors. Below is an overview of how we handle nodes returning errors, including error detection, retry mechanisms, graceful shutdown, rolling restarts, and error tracking.
Error Detection and Isolation
The Backpac Load Balancer continuously monitors blockchain nodes for any signs of errors, such as:
- Timeouts
- Exceptions
- Rate Limits
- Authentication Failures
- Internal server errors
- Unresponsive nodes
When a node exhibits any of these error patterns, our system immediately isolates it by:
- Routing Traffic to other Nodes: The load balancer dynamically detects nodes returning errors and reroutes traffic to healthy, functional nodes. This ensures that users are always directed to nodes that can handle their requests and prevents downtime.
Retry Logic
For transient errors (temporary or recoverable errors):
- Retry Mechanism: If a node returns an error, a retry is automatically attempted either on the same node or a different, healthy node.
- Error Tolerance: This retry process helps mitigate temporary glitches or network issues and ensures that the service remains responsive even if one node is facing a brief issue.
Error Metrics, Latency and Requests
To proactively manage errors and maintain high service availability, The Backpac Load Balancer employs detailed error, latency and request tracking mechanisms:
Error Metrics: We track key error-related metrics such as error rates, response times, and node downtime. This helps identify trends and potential issues with specific nodes or services.
Latency Metrics: We track the latency of each endpoint based on the RPC method used in the request. This metric is then used to evaluate and determine the best endpoint for subsequent requests of the same RPC method. By routing requests to the lowest-latency endpoint, we ensure optimal performance for each request.
Request Metrics: We count the total number of requests sent to a server, regardless of the RPC method used. This metric helps us monitor overall traffic load for an endpoint, ensuring an even distribution of traffic across endpoints within a Target Group helping the system remain responsive and capable of handling the incoming traffic efficiently.
Summary
Backpac Load Balancer failover strategy for nodes returning errors involves:
Error Detection and Isolation: Immediate redirection of traffic away from nodes that exhibit errors.
Retry Logic: Automatic retries for transient errors to ensure continuity of service.
Metrics: Real-time monitoring and failover to detect and address errors quickly.
By leveraging these strategies, Backpac.xyz helps to ensure minimal disruptions, even in the face of individual node failures.