Degraded performance
Incident Report for Magic
Postmortem

What happened

A spike in traffic occurred and our API pods did not scale out quickly enough to absorb it.

The following were unavailable or partially degraded:

  • API Login
  • Logging into the Magic Dashboard

How we responded

During the outage, our engineers quickly joined a virtual war room to triage the situation and identify the fastest, most impactful fix. After investigating the issue, we began increasing our overall server fleet size, which stabilized the service under the traffic spike.

We also found that our scale-up policies had room for improvement. Our team immediately modified them to better accommodate high traffic spikes in the future, and we were able to observe the improvement during later traffic spikes.
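
To illustrate why a scale-up policy matters during a spike, here is a minimal sketch in Python. It assumes a proportional autoscaling rule (target replicas grow with the ratio of observed to target per-pod load) combined with a cap on how many pods may be added per scaling step. The function names, the cap, and every number below are hypothetical and chosen for illustration only; this report does not state which autoscaler or settings were actually in use.

```python
import math


def desired_replicas(current_replicas: int, current_load: float, target_load: float) -> int:
    """Proportional rule: scale the replica count by observed/target per-pod load."""
    return max(1, math.ceil(current_replicas * current_load / target_load))


def scale_up_step(current_replicas: int, desired: int, max_added_per_step: int) -> int:
    """Apply one scaling step, adding at most max_added_per_step pods."""
    return min(desired, current_replicas + max_added_per_step)


def steps_to_reach(current_replicas: int, desired: int, max_added_per_step: int) -> int:
    """Count how many scaling steps are needed to reach the desired replica count."""
    steps = 0
    while current_replicas < desired:
        current_replicas = scale_up_step(current_replicas, desired, max_added_per_step)
        steps += 1
    return steps


# Hypothetical spike: 10 pods, each seeing 3x its target load.
target = desired_replicas(current_replicas=10, current_load=150.0, target_load=50.0)
slow = steps_to_reach(10, target, max_added_per_step=2)   # conservative policy
fast = steps_to_reach(10, target, max_added_per_step=20)  # more aggressive policy
```

Under these assumed numbers, the conservative policy needs many more scaling steps than the aggressive one to reach the same pod count, which is the kind of lag a faster scale-up policy is meant to reduce.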

Posted Aug 11, 2022 - 11:55 PDT

Resolved
This incident has been resolved.
Posted Jul 24, 2022 - 20:16 PDT
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Jul 24, 2022 - 19:25 PDT
Investigating
You may temporarily experience increased latency and error rates. We are actively investigating the degraded performance issue.
Posted Jul 24, 2022 - 19:25 PDT
This incident affected: Authentication, API, and Dashboard.