Sauce Labs Maintenance Windows for Sauce Labs
We are investigating an issue with a third-party provider that is causing intermittent "Connection timed out", "pool communication", and "Unknown error while proxying appium request" errors when running tests in our EU and US datacenters. We will be performing emergency maintenance on our EU-Central and US-West datacenters this week to address this issue. Emergency maintenance windows will be posted to this page.
2022-December-16 Resolved Service Incident 2
Incident Report for Sauce Labs


Friday December 16th 2022, 20:51 - 21:31 UTC

What happened:

All tests waiting to be assigned to a virtual device during the incident were marked as failed. The error rate was approximately 70-80%, and the incident primarily impacted customers requesting specific test devices.

Why it happened:

Demand for macOS increased to the point where it exceeded the available capacity. As this happened, new tests began to queue up, which triggered the clearing of the new tests queue, and all of the tests waiting in this queue were marked as failed. While the underlying issue was with macOS and iOS capacity, it impacted all test types because the new tests queue, which is shared, was backed up.

How we fixed it:

Clearing the new tests queue restored the system from starvation; usually no additional action is required. In some cases the starvation returns even after the new tests queue has been cleared, and the queue is then cleared again.

What we are doing to prevent it from happening again:

We are looking at ways to increase our capacity, specifically for macOS and iOS. We are also looking into ways to clear jobs from queues by image name or platform, rather than clearing the whole queue.
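
As a rough illustration of what clearing by platform could look like, here is a minimal Python sketch. It is not Sauce Labs' implementation: the QueuedJob type, its platform field, and the clear_by_platform function are hypothetical names chosen for this example, and the queue is modeled as a simple in-memory deque. The idea is that only jobs for the starved platforms are removed, while jobs for other platforms keep their position in the shared queue.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class QueuedJob:
    """Hypothetical queued test job; field names are assumptions."""
    job_id: str
    platform: str  # e.g. "macOS", "iOS", "Windows"


def clear_by_platform(queue, platforms):
    """Split a shared queue into jobs to keep and jobs to clear.

    Jobs whose platform is in `platforms` are cleared; all other jobs
    are kept in their original order.
    """
    remaining = deque()
    cleared = []
    for job in queue:
        if job.platform in platforms:
            cleared.append(job)
        else:
            remaining.append(job)
    return remaining, cleared


# Example: a shared queue backed up by macOS and iOS demand.
shared_queue = deque([
    QueuedJob("a1", "macOS"),
    QueuedJob("b2", "Windows"),
    QueuedJob("c3", "iOS"),
    QueuedJob("d4", "Linux"),
])

remaining, cleared = clear_by_platform(shared_queue, {"macOS", "iOS"})
```

In this sketch only the macOS and iOS jobs would be marked as failed, while the Windows and Linux jobs stay queued. The same filter could be keyed on an image name instead of a platform.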

Posted Jan 24, 2023 - 10:06 UTC

Between 20:53 UTC and 21:14 UTC we experienced a spike in job errors on our Virtual Device cloud in our US West Data Center. All services are fully operational.
Posted Dec 16, 2022 - 23:46 UTC