Sauce Labs Maintenance Windows
Sauce Labs has a previously scheduled four-hour maintenance window on Wednesday, July 10th, from 04:00 UTC to 08:00 UTC. During this window, we will be updating infrastructure and services in our US West data center. This work may cause portions of the service (including running automated and manual tests) to be unavailable for up to four hours.
2023-April-11 Service Incident-2
Incident Report for Sauce Labs
Postmortem

Dates:

Tuesday, April 11, 2023, 19:02 UTC - Wednesday, April 12, 2023, 08:00 UTC

What happened:

Tests running on desktop and virtual mobile devices experienced high wait times and error rates at session start. 

Why it happened:

The data store supporting VM state management reached CPU saturation, causing periodic outages of our VM state management service, each lasting 2-10 minutes. As a result, not enough capacity was autoscaled to meet demand.

How we fixed it:

We made a number of service and infrastructure changes to reduce the CPU utilization of the data store, including:

  • Reducing the number of CPU-heavy queries
  • Splitting some state management workflows to new, separate data stores

What we are doing to prevent it from happening again:

  • Monitoring will be improved to better identify this failure case in the future.
  • We will continue to move workflows into separate data stores to mitigate resource saturation and reduce the blast radius of any single data store failure.
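
As an illustration of the kind of check the monitoring item above describes, the following is a minimal sketch of detecting sustained CPU saturation from periodic samples. It is not Sauce Labs' actual monitoring. The function name, the 90% threshold, and the 120-second minimum duration (chosen to match the lower bound of the 2-10 minute outages described above) are all assumptions made for this example.

```python
from typing import Iterable, List, Tuple


def saturation_windows(
    samples: Iterable[Tuple[float, float]],
    threshold: float = 90.0,      # assumed CPU % treated as "saturated"
    min_duration: float = 120.0,  # assumed minimum sustained length, in seconds
) -> List[Tuple[float, float]]:
    """Return (start, end) timestamps of sustained saturation periods.

    `samples` is a time-ordered sequence of (timestamp_seconds, cpu_percent).
    A period is reported only if CPU stayed at or above `threshold` for at
    least `min_duration` seconds between its first and last saturated sample.
    """
    windows: List[Tuple[float, float]] = []
    start = None  # timestamp of the first saturated sample in the current run
    last = None   # timestamp of the most recent saturated sample
    for ts, cpu in samples:
        if cpu >= threshold:
            if start is None:
                start = ts
            last = ts
        else:
            # The run ended; keep it only if it lasted long enough.
            if start is not None and last - start >= min_duration:
                windows.append((start, last))
            start = None
    # Handle a run that is still in progress at the end of the data.
    if start is not None and last - start >= min_duration:
        windows.append((start, last))
    return windows


if __name__ == "__main__":
    # Hypothetical one-minute samples: (seconds since start, CPU %).
    cpu_samples = [(0, 50), (60, 95), (120, 97), (180, 99), (240, 40), (300, 96), (360, 30)]
    print(saturation_windows(cpu_samples))
```

Requiring a minimum duration keeps short CPU spikes from triggering alerts while still flagging saturation long enough to interrupt a dependent service.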
Posted Apr 17, 2023 - 22:38 UTC

Resolved
After taking remedial action, wait times are normal in our US datacenter and all services are fully operational.
Posted Apr 12, 2023 - 00:23 UTC
Update
After taking remedial action, we are seeing improved launch times.
We are currently monitoring.
Posted Apr 12, 2023 - 00:06 UTC
Update
We continue to see elevated wait times in our US datacenter. We continue to investigate.
Posted Apr 11, 2023 - 22:19 UTC
Investigating
We are experiencing elevated wait times in our US datacenter, which may cause tests to be slow to start. We are investigating.
Posted Apr 11, 2023 - 21:48 UTC
This incident affected: Automated Browser Testing (US-West), Automated Virtual Mobile Device Testing (US-West), Live Browser Testing (US-West), and Live Virtual Mobile Device Testing (US-West).