Is the Glue Holding Our Digital Services Together Melting?

Last week, Google and Oracle essentially passed out from the heat – the cause? “Cooling issues”, we were told. With that in mind, Ross Gray, CEO, Cloudsoft, discusses the issues and offers some solutions.

Our daily lives are so entwined with internet-delivered digital services that even a minor outage can have a huge impact. And lately, they feel more fragile than ever. 

If it isn’t configuration errors locking millions out of their Facebook and Instagram accounts, or thousands of EasyJet passengers stranded at airports because of ‘technical issues’, then it’s record-breaking temperatures bringing two of the UK’s highest-profile data centres to a grinding halt. That is what happened last week, when cloud services and servers hosted by Google and Oracle essentially passed out from the heat. The cause? “Cooling issues”, we were told.

Britain’s blistering weather was alarming on many levels – not least, the horrific scenes in north London as wildfire swept through tinder-dry gardens in normally quiet suburban streets.

But, once again, we have experienced wide-scale IT failures exposing just how ill-equipped our critical digital infrastructure is for a future when 40-degree summers are ever more frequent.

The Google and Oracle shutdowns led to outages for countless thousands of everyday customers, large and small, including WordPress websites hosted in the UK by WP Engine, which claims to provide “the fastest, most reliable” hosting for more than 1.5 million websites globally. Both technology giants simply shut down parts of their systems, without warning, to protect the stability of their entire networks.

These are global IT heavyweights, significantly impacted by seemingly unprecedented events that caused disruption across the IT sector. We saw headlines about them because of their size. But think of the number of small, medium and large organisations that were also disrupted, by this and by their own weather-related IT issues, during just two sweltering days.

Climate experts predict extreme weather events could become business as usual in years to come. The potential for a national ‘meltdown’, for want of a better word, however, has been on the horizon for a while.

Research by the Atlas VPN team in April found that more than three-quarters of companies (76%) across the globe had experienced unexpected network downtime in the past year alone. System crashes, human errors, and cyberattacks were the primary causes.

It’s an undeniable fact: more organisations than ever rely on increasingly complex IT environments and digital supply chains. As a result, they are more fragile and open to disruption than ever before.

In February Gartner released a report calling organisational resilience a “strategic imperative”, adding it considered it “the antidote to fragility induced risk” now hanging over “technology estates, supply chains, internal processes and more”.

It concluded that, as operations become more complex and sophisticated, every business can expect potentially terminal disruption at some point along its digital supply chain and operating line.

As organisations’ digital footprints grow, they can become more complex and fragile at the same time as they are becoming more exposed to the impacts of climate disruption. And this is worsened by a “just in time” attitude many businesses are employing to reduce costs and increase turnover: a single point of delay or disruption will have a domino effect, risking failure across the business model.

This is particularly important for organisations within highly regulated sectors, such as financial services. Imagine the catastrophic national economic damage and unprecedented market turmoil if a major bank were floored by a system-wide outage because it got too hot outside.

The EU is currently developing new regulations around Digital Operational Resilience – commonly known as DORA – and the UK’s Financial Conduct Authority has taken similar steps.

There are promising signs that change is starting to happen at corporate levels too. Major players including HSBC, Virgin Money and BNP Paribas have recently been recruiting widely for operational resilience staff and, according to LinkedIn, more than 7,500 operational resilience experts have started in new roles over the past two years.

Even as the UK mercury dropped, we saw offices around the world again brought to a shuddering halt after the Microsoft Teams platform was hit by a global outage, disrupting tens of thousands of customers for hours.

Resilience is now a business prerequisite for survival in the digital age. Every organisation should treat last week’s headline outages as a stark wake-up call to ensure its IT is robust and prepared to maintain operations should a major incident occur.

What were once exceptional occurrences are now happening all the time, to anyone, no matter their size, for a myriad of different reasons.

Last week’s record-breaking heatwave may not have actually caused the glue holding the internet together to melt, but it brutally exposed how ill-prepared digital infrastructure is to deal with record-breaking temperatures.

As more extreme weather becomes the norm and companies continue to add to their already complex IT environments, we can expect to see more events like this occurring. Businesses must take steps to address their vulnerabilities – whether they are regulated or not. The economic, reputational and operational costs are too severe not to.

By Ross Gray, CEO, Cloudsoft.

Guest Contributor