High error rates impacting a subset of EU customers
Incident Report for Interact
Postmortem

Early on April 23, 2021, we released a new version of Interact Web for Public Cloud (61.1) and an update for Pulse (2.0.0). Around 10:30am UK time, we saw web server traffic and wait times rise, along with spikes and latency on the login servers. We doubled the size of our web farm in the hope of relieving any contention, and continued to monitor the situation, as we expected traffic and access to stabilize shortly. Around 10:53am UK time, we received the first reports from customers of login issues and 500 errors. While troubleshooting, we suspected a problem in the AMIs used for the new server builds (Interact 61.1 / Pulse 2.0.0). To prevent additional downtime, we reverted the Interact Web 61.1 and Pulse 2.0.0 release, restoring the login and web servers to their original AMI builds. At 11:30am UK time, we completed the revert to the previous version of Interact Web and Pulse and observed normal login and web activity, so we moved the incident into a monitoring phase. Two hours later, we closed the incident.

Posted Apr 27, 2021 - 11:14 UTC

Resolved
We sincerely apologise for the recent disruption to your Interact site. Our Infrastructure team have now fully resolved this issue, and a full postmortem will be published here as soon as our investigations have been completed.

Again, we sincerely apologise for the disruption this may have caused you and your intranet users.
Posted Apr 23, 2021 - 12:32 UTC
Monitoring
We have identified and implemented a fix. We will continue to monitor error rates before confirming that the issue is resolved.

We thank you for your patience.
Posted Apr 23, 2021 - 10:34 UTC
Update
We are continuing to investigate the situation.

We sincerely apologise for the disruption this is causing you and your intranet users.
Posted Apr 23, 2021 - 10:24 UTC
Investigating
Interact engineers are currently investigating higher-than-normal error rates and intermittent service disruption across all EU customers.
Posted Apr 23, 2021 - 09:56 UTC
This incident affected: EMEA Public Cloud.