As we continue to upgrade our infrastructure over these few weeks, we have improved the error handling and auto-recovery on our NGINX load balancers. What this means is that if there are network or server problems, the system we have in place will automatically repair itself with near-zero disruption to users.
We previously had a HARD failover in place, meaning that when we detected 3 errors, we did a hard shutdown of the server and automatically replaced it with another server (we now have 3 NGINX servers handling the front-end load balancing).
We have replaced that with a SOFT failover as the first response, meaning we remove the affected NGINX load balancer from the list of IPs returned for runsignup.com. While we may get some false positives, the good part is that changing the DNS should have no impact on users, and it should move some people off the server before the HARD handler, which could impact users, kicks in.
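For the technically curious, the difference between the two failover levels can be sketched roughly as follows. This is only an illustration of the idea, not the production implementation: the class name `LoadBalancerPool`, its methods, the example IP addresses, and the soft threshold of 1 error are all assumptions made for this sketch. Only the hard threshold of 3 errors comes from the description above.

```python
from dataclasses import dataclass, field

SOFT_THRESHOLD = 1  # assumption: the real soft threshold is not stated
HARD_THRESHOLD = 3  # from the post: 3 errors trigger the hard failover


@dataclass
class LoadBalancerPool:
    """Hypothetical tracker of consecutive errors per load balancer IP."""
    ips: list
    errors: dict = field(default_factory=dict)
    hard_failovers: list = field(default_factory=list)

    def record_check(self, ip, ok):
        # A passing check resets the count; a failing check increments it.
        self.errors[ip] = 0 if ok else self.errors.get(ip, 0) + 1
        if self.errors[ip] >= HARD_THRESHOLD and ip not in self.hard_failovers:
            # Hard failover: a real system would shut down and replace
            # the server here. This sketch only records the event.
            self.hard_failovers.append(ip)

    def dns_answers(self):
        # Soft failover: return only IPs below the soft threshold.
        healthy = [ip for ip in self.ips
                   if self.errors.get(ip, 0) < SOFT_THRESHOLD]
        # Never return an empty answer; fall back to the full list.
        return healthy or list(self.ips)


# Example with documentation-range IPs (placeholders, not real servers).
pool = LoadBalancerPool(ips=["192.0.2.1", "192.0.2.2", "192.0.2.3"])
pool.record_check("192.0.2.2", ok=False)
print(pool.dns_answers())  # ['192.0.2.1', '192.0.2.3']
```

In this sketch a single failed check is enough to drop a server from the DNS answers, so a false positive only shifts traffic to the other servers, while the disruptive replacement is held back until the errors persist.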
This is likely more complex than most of our customers care to understand, which is exactly the point of us taking care of the technology for you…