Network Issue
Incident Report for Think Cloud
Postmortem

RFO Report for Network Service Incident

Summary

At approximately 17:00 on Tuesday 9th August 2022, our monitoring alerted us to a network-affecting issue resulting in a loss of all connectivity to our network.

Upon investigation, we identified that we had lost communication with both core switches, resulting in a network-wide outage.

Our network operations team attempted to hard reboot the switches to restore service; however, their attempts were unsuccessful.

At 17:40 our field engineer arrived at the rack and rebooted the affected devices. This action restored network operation, and traffic levels started to recover.

At 18:15 we identified that traffic levels had recovered to only 30% of their pre-incident level. Deeper investigation showed this was potentially caused by DNS caching on our internal resolvers. Flushing the DNS cache resolved the issue, and traffic returned to normal levels.

At 18:50 our network engineering team performed previously scheduled emergency maintenance to replace a routing engine on our core router.

Investigation and Root Cause Analysis

Following further investigation and consultation with the vendor's technical assistance centre, we have identified that this issue was caused by a firmware issue, known to the manufacturer, that affects our current switch configuration.

Posted Aug 16, 2022 - 11:45 BST

Resolved
This incident has been resolved.
Posted Aug 10, 2022 - 11:47 BST
Monitoring
Our engineers have isolated the issue and traffic levels appear to be stabilising.
Our network team will continue to monitor the network.

If you have any issues, please contact the helpdesk.
support@cloudspaceuk.co.uk
Posted Aug 09, 2022 - 19:13 BST
Update
Our engineers are still working on implementing a fix.
There may be some instability while this is happening.

We will provide another update in 15 minutes.
Posted Aug 09, 2022 - 18:53 BST
Update
Our engineers are still working on implementing a fix.
There may be some instability while this is happening.

We will provide another update in 15 minutes.
Posted Aug 09, 2022 - 18:36 BST
Update
Our engineers are still working on a fix.
There may be some instability while this happens.

We will provide another update in 15 minutes.
Posted Aug 09, 2022 - 18:21 BST
Identified
The issue has been identified and a fix is being implemented.
Posted Aug 09, 2022 - 18:05 BST
Update
Our engineers have located the issue and are working on a fix.

Services should start coming back online shortly.

We will update in another 15 minutes or sooner.
Posted Aug 09, 2022 - 18:05 BST
Update
We are continuing to investigate the issue.

We will provide another update in 15 minutes.
Posted Aug 09, 2022 - 17:44 BST
Update
We are continuing to investigate the issue.
We will be providing updates every 15 minutes.
Posted Aug 09, 2022 - 17:27 BST
Investigating
We are aware of a network outage currently affecting our services.

We have engineers working on the situation and will update this page as we learn more.
Posted Aug 09, 2022 - 17:18 BST
This incident affected: Core Network, Cloud Network, and Dedicated Network.