Date: Nov 08, 2017
Time: 11:30AM - 12:40PM PST
There were intermittent failures starting new Sauce Connect tunnels.
Why Did It Happen?
A flaw in our internal monitoring led to a miscalculation of our available capacity for new tunnels. As a result, requests for new tunnels were routed to resources already under load, causing tunnels to start more slowly than normal and eventually to fail to start at all. The miscalculation was triggered when some resources were removed from our cloud and then restored.
How Was It Resolved?
We corrected the miscalculation of tunnel capacity in our internal monitoring. Requests for new tunnels were then automatically routed to the correct resources, and tunnels began starting at normal speed again.
What Are We Doing To Ensure It Never Happens Again?
We’re improving our monitoring system by adding methods to identify miscalculations of our service’s tunnel capacity, particularly when additional resources are provisioned.
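As an illustration only, a check of this kind could recompute free tunnel capacity from a per-host inventory and compare it with the figure the monitoring system reports. The sketch below is not our production code: the `Host` fields, the function names, and the `tolerance` parameter are all hypothetical and exist only to show the idea.

```python
from dataclasses import dataclass


@dataclass
class Host:
    # Hypothetical inventory record; field names are assumptions.
    name: str
    max_tunnels: int
    active_tunnels: int
    in_service: bool


def recompute_capacity(hosts):
    """Sum the free tunnel slots across hosts that are currently in service."""
    return sum(
        max(h.max_tunnels - h.active_tunnels, 0)
        for h in hosts
        if h.in_service
    )


def capacity_mismatch(reported, hosts, tolerance=0):
    """Return True when the reported capacity differs from the recomputed
    value by more than `tolerance` slots."""
    return abs(reported - recompute_capacity(hosts)) > tolerance


hosts = [
    Host("host-a", max_tunnels=10, active_tunnels=10, in_service=True),
    Host("host-b", max_tunnels=10, active_tunnels=2, in_service=True),
    Host("host-c", max_tunnels=10, active_tunnels=0, in_service=False),
]

# A reported value that still counts the out-of-service host is flagged.
print(capacity_mismatch(reported=18, hosts=hosts))
print(capacity_mismatch(reported=8, hosts=hosts))
```

Running a comparison like this whenever resources are removed, restored, or provisioned would be one way to surface a stale capacity figure before new tunnel requests are routed against it.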