Sauce Labs Maintenance Windows for Sauce Labs
We are investigating an issue with a third-party provider that is causing intermittent "Connection timed out", "pool communication", and "Unknown error while proxying appium request" errors when running tests in our EU and US datacenters. We will be performing emergency maintenance on our EU-Central and US-West datacenters this week to address this issue. Emergency maintenance windows will be posted to this page.
2022-December-06 Service Incident
Incident Report for Sauce Labs
Postmortem

Dates:

Tuesday, December 6, 2022, 14:53 UTC - Wednesday, December 7, 2022, 17:30 UTC

What happened:

We experienced intermittent connection timeouts between internal services in our US region. There was no perceived or reported customer impact, but the error rate may have been slightly elevated because of these timeouts.

Why it happened:

An internal service responsible for proxying requests between services was experiencing CPU throttling, which caused intermittent latency and made some requests time out. In this case, the service was hitting its allocated resource limits.

How we fixed it:

We increased the resource limits for the service and saw requests return to normal. 

What we are doing to prevent it from happening again:

Although our synthetic monitoring and alerting made us aware of this issue, we are adding better observability and alerting for this particular service at both the application and infrastructure levels. This will give us a more direct indication of where the underlying problem lies.
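
As an illustration of what an infrastructure-level check for CPU throttling could look like, here is a minimal sketch. It assumes the service runs in a Linux container whose cgroup exposes a `cpu.stat` file with `nr_periods` and `nr_throttled` counters. This report does not describe the actual platform or tooling, so the function names and the 25% threshold are hypothetical.

```python
def parse_cpu_stat(text: str) -> dict[str, int]:
    """Parse the 'key value' lines of a cgroup cpu.stat file into a dict."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Keep only well-formed "name integer" lines.
        if len(parts) == 2 and parts[1].isdigit():
            stats[parts[0]] = int(parts[1])
    return stats


def throttle_ratio(before: dict[str, int], after: dict[str, int]) -> float:
    """Fraction of scheduler periods that were throttled between two samples."""
    periods = after.get("nr_periods", 0) - before.get("nr_periods", 0)
    throttled = after.get("nr_throttled", 0) - before.get("nr_throttled", 0)
    if periods <= 0:
        # No enforcement periods elapsed (or counters reset): nothing to report.
        return 0.0
    return throttled / periods


def should_alert(ratio: float, threshold: float = 0.25) -> bool:
    """Hypothetical alert rule: fire when the throttled share reaches the threshold."""
    return ratio >= threshold
```

Sampling the counters twice and comparing the deltas, rather than reading the cumulative totals, keeps the check sensitive to recent throttling even on a long-running service.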

Posted Jan 10, 2023 - 12:39 UTC

Resolved
After a few hours of monitoring, we believe this issue’s impact is negligible and should not affect testing. All services are fully operational.
Posted Dec 06, 2022 - 19:13 UTC
Update
We believe the customer impact of this issue is negligible, but our investigation continues.
Posted Dec 06, 2022 - 16:22 UTC
Investigating
We are investigating intermittent SSL connection timeouts on our ondemand.us-west-1.saucelabs.com endpoint. This may result in errors when attempting to run automated tests in our US-West region.
Posted Dec 06, 2022 - 15:14 UTC
This incident affected: Automated Browser Testing (US-West), Automated Virtual Mobile Device Testing (US-West), Automated Real Device Testing (US-West), and Native Framework Mobile App Testing (US-West).