Sauce Labs Maintenance Windows
Sauce Labs has a previously scheduled four-hour maintenance window on Wednesday, July 10th, from 04:00 UTC to 08:00 UTC. During this window, we will be updating infrastructure and services in our US West data center. These maintenance actions may cause portions of the service (including running automated and manual tests) to be unavailable for up to four hours.
2022-December-16 Resolved Service Incident 2
Incident Report for Sauce Labs
Postmortem

Dates:

Friday December 16th 2022, 20:51 - 21:31 UTC

What happened:

All tests waiting to be assigned to a virtual device during the incident were marked as failed. The error rate was approximately 70-80%, and the failures primarily affected customers requesting specific test devices.

Why it happened:

Demand for macOS increased until it exceeded the available capacity. As this happened, new tests began to queue up, which triggered the clearing of the new jobs queue, and all of the tests waiting in this queue were marked as failed. While the underlying issue was with macOS and iOS capacity, it impacted all test types because the new test queue (which is shared) was backed up.
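
To illustrate how a capacity shortage on one platform can hold up every test type in a shared queue, here is a minimal sketch. It assumes a strict first-in, first-out dispatcher, which is an assumption made for illustration and not a description of the actual Sauce Labs scheduler. The names `dispatch_fifo`, `shared_queue`, and `slots` are hypothetical.

```python
from collections import deque


def dispatch_fifo(queue, free_slots):
    """Start jobs from the head of a shared queue (hypothetical strict FIFO).

    Stops as soon as the job at the head has no free slot on its platform,
    so every job behind it waits too, whatever platform it requested.
    """
    started = []
    while queue:
        job_id, platform = queue[0]
        if free_slots.get(platform, 0) == 0:
            break  # head-of-line blocking: nothing behind this job can start
        free_slots[platform] -= 1
        queue.popleft()
        started.append(job_id)
    return started


# Illustrative data only: macOS has no free capacity, other platforms do.
shared_queue = deque([("t1", "macOS"), ("t2", "Linux"), ("t3", "Windows")])
slots = {"macOS": 0, "Linux": 5, "Windows": 5}

started = dispatch_fifo(shared_queue, slots)
print(started, len(shared_queue))
```

In this sketch the Linux and Windows jobs remain queued behind the macOS job even though capacity exists for them, which is one way a shared queue can back up for all test types.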

How we fixed it:

Clearing the new tests queue restored the system from starvation; usually no additional action is required. In some cases the starvation returns even after the new tests queue has been cleared, and another clearing is performed.

What we are doing to prevent it from happening again:

We are looking at ways to increase our capacity specifically for macOS and iOS. We are also looking into ways that jobs can be cleared from queues by image name or platform, rather than clearing the whole queue.
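
As a rough sketch of what clearing by platform rather than clearing the whole queue could look like, the example below removes only the jobs for one platform and leaves the rest queued in their original order. The `Job` type, its fields, and `clear_by_platform` are hypothetical names chosen for illustration; they are not part of any Sauce Labs API.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """Hypothetical queued test job."""
    job_id: str
    platform: str


def clear_by_platform(queue, platform):
    """Split a queue into (kept, cleared) by platform, preserving order.

    Only jobs whose platform matches are cleared; all other jobs stay
    queued in their original relative order.
    """
    kept = deque()
    cleared = []
    for job in queue:
        if job.platform == platform:
            cleared.append(job)
        else:
            kept.append(job)
    return kept, cleared


# Illustrative data only.
queue = deque([
    Job("a", "macOS"),
    Job("b", "Windows"),
    Job("c", "macOS"),
    Job("d", "Linux"),
])

kept, cleared = clear_by_platform(queue, "macOS")
print([j.job_id for j in kept], [j.job_id for j in cleared])
```

The same filter could key on an image name instead of a platform; the point of the sketch is that jobs for unaffected platforms would not need to be marked as failed.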

Posted Jan 24, 2023 - 10:06 UTC

Resolved
Between 20:53 UTC and 21:14 UTC we experienced a spike in job errors on our Virtual Device cloud in our US West Data Center. All services are fully operational.
Posted Dec 16, 2022 - 23:46 UTC