Date: May 26, 2017
Time: 12:16 pm to 1:23 pm PDT
What happened:
New tests and Sauce Connect tunnels could not be started, and wait times for VMs were high.
Why it happened:
After we added more capacity to the cloud, the DHCP (Dynamic Host Configuration Protocol) service failed to restart because of a configuration error. VMs for new tests rely on DHCP to obtain an IP address, so they were unable to start.
How we fixed it:
We fixed the error in the DHCP configuration file and restarted DHCP.
What are we doing to prevent it from happening again:
- Add checks so that DHCP is not restarted when this configuration error is present.
- Add checks so that if DHCP does not successfully restart, we are notified earlier.
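The two checks above could be combined into a single guarded restart routine. The following is a minimal sketch, not the tooling actually used: the function name safe_restart, its parameters, and the notify callback are all hypothetical, and the validation, restart, and status commands are passed in by the caller because this report does not say which DHCP server or service manager is involved.

```python
import subprocess
from typing import Callable, Sequence


def _run(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(list(cmd)).returncode


def safe_restart(
    validate_cmd: Sequence[str],
    restart_cmd: Sequence[str],
    status_cmd: Sequence[str],
    notify: Callable[[str], None],
    run: Callable[[Sequence[str]], int] = _run,
) -> bool:
    """Restart the service only if its configuration validates.

    All names here are hypothetical. `notify` stands in for whatever
    alerting channel is available (pager, chat, email).
    Returns True only if validation, restart, and the status check all pass.
    """
    # Check 1: never restart with a configuration that fails validation,
    # so the currently running service keeps serving leases.
    if run(validate_cmd) != 0:
        notify("DHCP configuration failed validation; restart skipped")
        return False

    # Check 2: alert immediately if the restart command itself fails.
    if run(restart_cmd) != 0:
        notify("DHCP restart command failed")
        return False

    # Check 3: confirm the service is actually up after the restart.
    if run(status_cmd) != 0:
        notify("DHCP service is not running after restart")
        return False

    return True
```

As one possible usage, assuming ISC dhcpd managed by systemd (an assumption, not something stated in this report), the validation command might be ["dhcpd", "-t", "-cf", "/etc/dhcp/dhcpd.conf"], the restart command ["systemctl", "restart", "dhcpd"], and the status command ["systemctl", "is-active", "--quiet", "dhcpd"]. The run parameter is injectable so that the logic can be exercised without touching a real service.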