Service Disruption to customers hosted in EU Cloud
Incident Report for Interact
Postmortem

Summary

On 21 January 2019, between 10:52 and 11:10 UTC, customers hosted in the EU Pod experienced failed requests for static assets, which resulted in broken themes (missing CSS, JS and images).

Investigation and Root Cause

Upon investigation, engineers identified that 6 (out of 12) servers had come into rotation and failed to map to the underlying asset store. New servers typically come into rotation every 6 hours as part of the Interact application's automatic scaling.

Interact believes that a recent update of our APM (Application Performance Monitoring) agent may have been responsible for the issue; however, pre-release testing and subsequent investigations have proved inconclusive. Interact responded to the issue by forcing a rotation of new hardware.

Resolution and Mitigation Steps

Following the incident, Interact has updated its automatic deployment process so that, if mapping to the underlying asset store fails, the remap is attempted a minimum of 5 times.
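
The retry behaviour described above can be illustrated with a minimal sketch. This is not Interact's deployment code, which this report does not describe. The function name `map_asset_store_with_retry`, the `map_fn` callable, the use of `OSError` to signal a failed mapping, and the delay between attempts are all assumptions made for illustration.

```python
import time


def map_asset_store_with_retry(map_fn, attempts=5, delay_seconds=2.0):
    """Call map_fn until it succeeds or the attempts are exhausted.

    map_fn is a hypothetical callable that maps the server to the asset
    store and raises OSError on failure. Returns the number of the attempt
    that succeeded.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            map_fn()
            return attempt
        except OSError as exc:
            last_error = exc
            # Wait before the next try, but not after the final attempt.
            if attempt < attempts:
                time.sleep(delay_seconds)
    # Every attempt failed: surface the error so the server is not put
    # into rotation without its asset store.
    raise RuntimeError(
        f"asset store mapping failed after {attempts} attempts"
    ) from last_error
```

In a sketch like this, raising after the final attempt matters as much as the retries themselves: a server that never maps successfully should fail its deployment rather than enter rotation and serve broken asset requests.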

Posted Jan 29, 2019 - 10:32 UTC

Resolved
This issue has now been resolved. We apologise for the inconvenience this has caused and will publish a full postmortem.
Posted Jan 21, 2019 - 11:10 UTC
Identified
The issue has been identified and a fix is being implemented.
Posted Jan 21, 2019 - 10:59 UTC
Investigating
Engineers are investigating reports of broken themes/display issues on EU public cloud. Updates to follow.
Posted Jan 21, 2019 - 10:52 UTC
This incident affected: EMEA Public Cloud.