Thursday September 28th 2023, 17:05 - 17:36 UTC
For the duration of the incident, customers could not start new tests in all regions.
A change that removed public cloud provider storage interfaces from the core Kubernetes codebase was introduced during a Kubernetes upgrade. We needed to rework large sections of our infrastructure management code to handle this change. In making these changes, we introduced an issue with applying firewall tags that broke key traffic outside the Kubernetes cluster. This was not detected in lower environments.
We manually rolled back the configuration that caused the issue.
We are addressing the underlying issue that caused the broken cluster networking. Once we have that fix we will plan a safe rollout.