Latency issues impacting a subset of EU Customers
Incident Report for Interact
Postmortem

The anti-virus solution was performing heavy processing, which caused a CPU spike that slowed down the data storage devices the web servers rely on. This in turn reduced the responsiveness of customer sites and led to 500 errors. The anti-virus process is normally very light, but on this occasion its CPU utilization alone jumped to 90%.

Interact is investigating the issue with the anti-virus vendor. Because this is not the first time the issue has occurred, we are also evaluating alternative vendors as a replacement.

Interact's engineering team is also finalizing S3-backed storage capabilities to replace the native NFS server storage. This will remove the need for the anti-virus pipeline to share hardware with the storage device, better decoupling security from operations. It will also remove the possibility of heavy anti-virus processes affecting the throughput or health of the storage system and, in turn, customer sites.

Posted Oct 20, 2020 - 07:34 UTC

Resolved
This incident has been resolved.
Posted Oct 06, 2020 - 09:35 UTC
Monitoring
A fix has been implemented and we are monitoring the results.
Posted Oct 06, 2020 - 08:29 UTC
Investigating
Interact engineers are currently investigating higher-than-normal latency in the EU, which is causing slowness and intermittent service disruption for all EU customers.
Posted Oct 06, 2020 - 08:24 UTC
This incident affected: EMEA Public Cloud.