2017-November 30 Service Incident
Incident Report for Sauce Labs Inc

Date: November 30, 2017
Time: 7:04 - 8:06 AM PST

What Happened:
Sauce Labs’ VM capacity dropped, causing high wait times for test VMs.

Why Did It Happen:
A bug in the system used to predict VM demand caused the management system responsible for pre-booting VMs to stop requesting new resources, which in turn caused a drop in available capacity.

What Did We Do to Fix It:
We manually set the VM demand values and allowed our cloud to catch up.

What Are We Doing to Prevent It from Happening Again:
We've corrected the initial bug in our prediction service and hardened the management system so that it is easier to debug and reacts more quickly to bad inputs.
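
For illustration only, here is a minimal sketch of how a pre-boot manager could guard against bad demand predictions. It is not taken from the actual system: the function names, the fallback to the last known-good value, and the floor of 10 VMs are all hypothetical assumptions.

```python
import math
from typing import Optional


def safe_demand(predicted: Optional[float], last_good: int, floor: int = 10) -> int:
    """Return a usable VM demand value (hypothetical helper).

    Falls back to the last known-good demand when the prediction is
    missing, non-numeric, NaN, infinite, or negative, and never goes
    below a minimum pool size.
    """
    bad_input = (
        predicted is None
        or isinstance(predicted, bool)
        or not isinstance(predicted, (int, float))
        or not math.isfinite(predicted)
        or predicted < 0
    )
    if bad_input:
        # Bad prediction: keep requesting capacity based on the last good value.
        return max(last_good, floor)
    # Good prediction: round up so fractional demand is never under-provisioned.
    return max(math.ceil(predicted), floor)


def vms_to_request(predicted: Optional[float], last_good: int,
                   running: int, floor: int = 10) -> int:
    """Number of additional VMs to pre-boot given the current running count."""
    target = safe_demand(predicted, last_good, floor)
    return max(target - running, 0)
```

With a guard like this, a NaN or missing prediction would leave the manager requesting VMs against the last good demand value instead of stopping altogether.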

Posted Dec 07, 2017 - 10:25 PST

Resolved
Wait times have returned to normal levels. All services are fully operational.
Posted Nov 30, 2017 - 08:24 PST
Monitoring
We've deployed a fix that should reduce VM wait times, and we're continuing to monitor it.
Posted Nov 30, 2017 - 08:00 PST
Investigating
We are seeing long wait times for automated and manual testing and are taking immediate action to rectify the issue.
Posted Nov 30, 2017 - 07:17 PST
This incident affected: Sauce Automated and Sauce Manual.