2018-August-31 Service Incident
Incident Report for Sauce Labs Inc
Postmortem

Date: August 31, 2018
Time: 2:41 am - 4:20 am PDT

*What happened:*
Tests did not start on our virtual platforms. The Web Application and REST API were unavailable.

*Why it happened:*
Our primary database server had a kernel-level issue that made the database unavailable, which caused many of our services to fail. The failover to our secondary database server began but did not complete quickly enough to prevent a significant impact on our services.

*How we fixed it:*
We completed the failover to the secondary database server manually.

*What we are doing to prevent it from happening again:*
We have added logging to the servers so that we can get more detailed information if the issue recurs. Additionally, we’re implementing an improved failover system that will allow us to shift seamlessly from unhealthy to healthy database nodes in the event of an issue. Finally, we’ll continue to work with our support vendor to harden our database implementation and attempt to determine a root cause for this specific issue.
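
As a rough illustration only, and not a description of the failover system mentioned above (its design is not detailed in this report), the general idea of health-check-driven failover might be sketched as follows. The names `Node`, `choose_active`, `is_healthy`, and `max_failures` are hypothetical and chosen purely for this example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Node:
    """A database node; `name` and `role` are illustrative labels only."""
    name: str
    role: str  # "primary" or "secondary"


def choose_active(
    nodes: Sequence[Node],
    is_healthy: Callable[[Node], bool],
    max_failures: int = 3,
) -> Optional[Node]:
    """Return the first node that passes a health probe, or None.

    Nodes are tried in the order given, so listing the primary first
    means a secondary is selected only after the primary has failed
    `max_failures` consecutive probes.
    """
    for node in nodes:
        failures = 0
        while failures < max_failures:
            if is_healthy(node):
                return node
            failures += 1
    # No node passed its probes; the caller must escalate (for example,
    # to a manual failover).
    return None
```

For example, with `nodes = [Node("db-1", "primary"), Node("db-2", "secondary")]` and a probe that reports only the secondary as healthy, `choose_active` returns the `db-2` node after three failed probes of `db-1`. Bounding the number of probes per node is one way to keep the switch from stalling on an unresponsive primary; a real system would also need probe timeouts and protection against two nodes acting as primary at once.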

Posted 2 months ago. Sep 07, 2018 - 08:57 PDT

Resolved
This incident has been resolved.
Posted 3 months ago. Aug 31, 2018 - 04:30 PDT
Update
We are continuing to monitor for any further issues.
Posted 3 months ago. Aug 31, 2018 - 04:24 PDT
Monitoring
A fix has been implemented and we are monitoring the results.
Posted 3 months ago. Aug 31, 2018 - 04:10 PDT
Identified
The issue has been identified and we are taking remedial action.
Posted 3 months ago. Aug 31, 2018 - 03:28 PDT
Investigating
New tests are not starting across all virtual platforms; our Website, REST API and Sauce Connect tunnels are unavailable. We are investigating.
Posted 3 months ago. Aug 31, 2018 - 02:56 PDT
This incident affected: Sauce Connect (Sauce Connect VM), Manual Testing (Manual VM Testing), REST API (REST API VMs), Web Interface (Sauce UI, Analytics), and Automated VM Testing (Automated PC Testing, Automated Mac Testing, Automated iOS Simulator Testing, Automated Android Emulator Testing).