Thursday September 8th 2022, 17:53 - 19:44 UTC
Mobile app uploads were intermittently failing for the duration of the incident.
Multiple long-running queries in "list_files" endpoint on our app storage service caused timeouts that made uploading requests fail. Some customers reported approximately 30 minutes of failed attempts to upload mobile apps.
Several database tables had growth that led to their respective indexes becoming stale, causing queries to those tables to perform poorly. This growth was directly correlated to a spike in requests to the service that handles app storage.
We ran a data archival script on the tables in question, rebuilt indexes, and added additional indexes to restore query performance.
The team has identified the introduction of rate limiting on this service as a way to control performance in a more orderly fashion.