US Errors on Image Loads

Incident Report for Interact

Postmortem

Root Cause Analysis

On the morning of May 3rd, 2021, part of Interact's infrastructure became unstable and slow to respond to incoming requests. The resulting increase in latency was first addressed by restarting the related servers. While this mitigated a number of issues, the part of the cloud environment responsible for serving user-uploaded images remained unable to process requests in a timely fashion.

Resolution

To fix this issue, Interact engineers rotated the servers related to this piece of the platform, and services returned to normal.

Remediation

Development is underway on a project to overhaul and upgrade a portion of Interact's cloud infrastructure to improve site reliability.

Posted May 06, 2021 - 07:51 UTC

Resolved

This incident has been resolved. Please contact us if you have any further issues.
Posted May 03, 2021 - 15:20 UTC

Monitoring

A fix has been implemented and we are monitoring the results.
Posted May 03, 2021 - 14:07 UTC

Identified

The issue has been identified and a fix is in progress.
Posted May 03, 2021 - 14:03 UTC

Investigating

Our team is receiving reports of 502 and 503 errors and broken images. Our engineers are currently investigating this issue.
Posted May 03, 2021 - 13:42 UTC

This incident affected: North America / HIPAA Public Cloud.