As always, we report any availability issues in the spirit of transparency.
We ran a database migration early this morning to better support international customers. The migration took 3 minutes and 29 seconds to complete. During this time, the site functioned fully, with two caveats. Anyone checking out at that time would have seen the “running man” icon hold on that page for up to 2 minutes. Anyone creating a new account, or on the first page of registration, would also have seen a delay. One user had to start their registration over as a result.
Database migrations are a normal part of a system that continuously improves, like RunSignUp. We have worked hard to build our infrastructure so that we can make these migrations without stopping the whole service, except on rare occasions.
Also, yesterday we had a failing memcache server. Thanks to our redundancy, there was no loss of service or transactions. Users in the middle of the day may have noticed that our response time got worse by about 0.3 seconds. Our alerting system worked well, and we were able to replace the failing component in the Cloud, again with no loss of service.
Year to date, we have had a 2-minute partial issue and today’s roughly 4-minute data migration. Depending on how you measure, that is either 100% uptime, with fewer than 5 users who saw delays or had to click refresh on their browsers, or, if those incidents are counted as downtime, 99.9982% uptime.
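For readers who want to check the arithmetic, here is a minimal sketch of how an uptime percentage like this can be computed. The function name `uptime_percent` and its parameters are ours, for illustration only. The post does not state how many days of the year have elapsed, so the 231 days used below is an assumption chosen for the example, not a figure from our records.

```python
def uptime_percent(downtime_minutes: float, elapsed_minutes: float) -> float:
    """Return uptime as a percentage of the elapsed period."""
    if elapsed_minutes <= 0:
        raise ValueError("elapsed_minutes must be positive")
    return 100.0 * (1.0 - downtime_minutes / elapsed_minutes)


# From the post: a 2-minute partial issue plus a roughly 4-minute migration.
downtime = 2 + 4

# ASSUMPTION: about 231 days have elapsed this year (not stated in the post).
elapsed = 231 * 24 * 60

print(f"{uptime_percent(downtime, elapsed):.4f}%")
```

Under that assumed elapsed time, 6 minutes of downtime works out to the 99.9982% quoted above. A different date would shift the last digits slightly.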