Service Status

Platform Incident - Assured and Elevated - ECC - All Zones in Regions 2, 8 Sunday 18th February 2018 12:00:00


Our suppliers have identified an air conditioning fault which is impacting services in regions 2 and 8.

This incident is currently under investigation and customers will be updated with any progress as soon as possible.

Incident resolved.

Customers with VMs in regions 2 and 8 who are still experiencing issues should reboot all VMs. If issues persist, please contact the support team.

Customers in region 5, zone B, please refer to the separate notification. We are currently troubleshooting this issue with one of our vendors.

Rolling reboots have now been completed for all zones.
We will continue to monitor all zones to ensure service functionality has been restored. Customers may find their VMs have been restarted. Please contact UKCloud Support if you experience an ongoing outage with your service.

Rolling reboots are still in progress for zone EC1. Zone AC1 is now back in service. Snapshot Protection service will remain suspended for the remainder of the night.

Rolling reboots are still in progress for zones AC1 and EC1.
Zone EE1 is now back in service. Snapshot Protection service will remain suspended for the remainder of the night.

We are continuing to recover any lost hosts/service in zones EE1, EC1, AE1 and AC1. Zones AC3 and AC2 are back in service. Snapshot Protection service remains suspended until all service is restored.

We are now working through each zone to recover any lost hosts/service in zones EE1, EC1, AE1, AC3 & AC2. This will involve a reboot of said devices, which will incur a vMotion of any resident VM.
Zone AC1 remains at risk due to a faulty network switch that is impacting storage connectivity. This switch is currently being replaced.
Snapshot Protection service remains suspended until all service is restored.

We have identified numerous items of hardware that failed as a result of the period of increased temperature, and we are awaiting a Field Engineer to attend and replace them. In parallel, we will be performing rolling reboots across Regions 2 and 8 to correct VMs that have been found to be running in memory. To further lessen the load on the environment, we have elected to disable Snapshot Protection across Regions 1, 2, 7 and 8 for tonight.

These actions will likely take several hours to complete, so our next update may not be for a few hours while we concentrate on the remediation work.

We are still identifying network devices that powered themselves down to protect their components, and we are powering them back up. Some of these are failing POST due to hardware faults caused by the increased operating temperature that persisted for the duration of the cooling issues.

We will continue to identify impacted services once we have regained full network connectivity, but for now our priority is restoring the network.

The ambient temperature within Region 2 and Region 8 has now returned to normal levels since the supplier cooling issues were resolved.

We are continuing to identify network devices that powered themselves down to protect their components, and we are powering them back up. We will continue to identify impacted services once we have regained full network connectivity.

The root cause has been identified and the cooling issue resolved. We are continuing to attempt to restore management access so that we can identify which services need to be restored.