Date: October 3, 2018
Time: 12:01 pm - 2:02 pm PDT; degraded service from 2:02 pm to 2:55 pm PDT
*Wait times for VMs in our PC cloud were above normal.
*Why it happened:
*An issue was discovered in our VM dispatcher service that is triggered under higher-than-normal load. The issue caused the dispatcher to underreport the total number of VMs available.
*How we fixed it:
*The problem resolved itself as load dropped.
*What we are doing to prevent it from happening again:
*We’ve changed our dispatcher retry logic to stabilize our capacity under high load. We’re continuing to work on the dispatcher service’s logic to make its distributed reporting more resilient under load. We are also adding capacity.