What can financial services firms learn from the AWS outage?

An outage of Amazon cloud services that impacted thousands of businesses and millions of users has raised serious questions around Big Tech monopolies – and how firms can increase their cyber resilience.

27 October 2025 6 mins read
By Jay Hampshire
Written by humans

In brief:

  • An outage of Amazon Web Services (AWS) affected more than 2,000 companies worldwide, including major banks and stock exchanges
  • The world’s largest cloud provider blamed faults at a data centre hub in northern Virginia, which caused serious disruption for nearly 15 hours
  • The issue has raised questions around how organizations can increase their resiliency despite dependency on a handful of Big Tech providers

Sometimes resolving an IT issue is as simple as “turning it off and on again.” Sometimes, an issue is so severe that it takes down half of the internet. Such was the case on Monday 20 October, as companies ranging from Snapchat to Starbucks, Lloyds to Lyft were affected by a serious and sustained outage of Amazon Web Services (AWS) cloud services.

DNS.O.S – What caused the AWS outage?

The “significant API errors and connectivity issues” experienced by those using AWS cloud services were attributed to “faults” in Amazon’s data center hub in northern Virginia, the company’s “oldest and biggest” site.

The fault was reported to lie with a database service known as DynamoDB, and it left systems unable to match website names with the numerical IP addresses used to load pages and applications. Many will recognise this as an incredibly mundane and common kind of outage – a Domain Name System (DNS) error.
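To make the failure mode concrete, here is a minimal Python sketch of the lookup step that broke: turning a hostname into IP addresses. It is an illustration only, not a description of how AWS or DynamoDB handle DNS internally. The function name `resolve_addresses` and its injectable `lookup` parameter are assumptions introduced for this example; `socket.getaddrinfo` is the standard-library call that performs the actual resolution.

```python
import socket
from typing import Callable, List


def resolve_addresses(hostname: str, port: int = 443,
                      lookup: Callable = socket.getaddrinfo) -> List[str]:
    """Return the IP addresses a hostname resolves to, or [] if DNS fails.

    `lookup` defaults to the standard resolver; it is injectable (an
    assumption of this sketch) so the failure path can be exercised offline.
    """
    try:
        results = lookup(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        # Name resolution failed: the service itself may be healthy, but
        # clients cannot find it, which is how a DNS fault looks from outside.
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in results})
```

On a working network, `resolve_addresses("example.com")` returns one or more addresses. An empty list signals the kind of resolution failure described above: the application has a name but nowhere to send its request.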

That such a commonplace error should cause such widespread disruption was seen as a cause for concern by many. Amazon stands as the world’s largest cloud services provider, ahead of Microsoft and Google, and it is estimated that the outage affected more than 2,000 companies and 8.1 million users worldwide. This included financial services organizations like the London Stock Exchange Group, crypto exchange Coinbase, banks including Lloyds and Halifax, and government organizations including HM Revenue & Customs – as well as economic powerhouse Roblox.

Cause for concern

While Amazon confirmed that “all AWS services returned to normal operations” by Monday evening, the scale of the near-15-hour disruption has raised concerns about the number and diversity of organizations relying on cloud services provided by a scant handful of Big Tech operators – meaning that, when issues arise, they cause major disruption to businesses and individuals.

The AWS outage comes just over a year since a “routine software update” from endpoint security provider CrowdStrike caused the largest IT outage in history, impacting over 8.5 million Microsoft Windows devices and leaving banks, airlines, healthcare providers, media outlets, and businesses unable to access their systems.

Commentators took issue with the potential for disruption when large-scale cloud services are concentrated among a small number of providers, with Dr Corinne Cath-Speth of Article 19 saying:

“We urgently need diversification in cloud computing. The infrastructure underpinning democratic discourse, independent journalism and secure communications cannot be dependent on a handful of companies.”

The outage also raised concerns within the U.K. government. The Treasury Committee sent a letter to the Economic Secretary to the Treasury seeking to establish how the government had been affected and why AWS had not been designated as a “critical third party,” which would bring it under FCA cyber resilience regulations. The letter also asked whether – considering AWS has won over £1.7 billion of U.K. government contracts in the last decade – Amazon’s claims that its “comprehensive approach to resilience” ensures “businesses can reliably maintain operations” can be substantiated.

Plan of (cyber)attack

Whether the result of a cyber-attack or an unforeseen systems outage, we are seeing an increase in firms and service providers being hit by disruptive events. The scale and lasting impacts are also increasing, with the recent hack of Jaguar Land Rover systems being described as the most “economically damaging cyber event” in U.K. history, resulting in a month-long production outage and an estimated £1.9 billion economic impact.

Firms are already subject to a range of regulations that govern minimum standards of operational resilience and preparedness in order to reduce potential risks from cyber-attacks or service outages. In the U.S., Financial Industry Regulatory Authority (FINRA) Rule 4370 stipulates that firms must establish written Business Continuity Plans (BCPs) and automated data backup and recovery, and undertake risk assessments of critical third parties. Similarly, the Commodity Futures Trading Commission (CFTC) has proposed an operational resilience framework built on business continuity and disaster recovery plans and third-party relationship programs to manage risk. The Office of the Comptroller of the Currency (OCC) has set five “baseline requirements” for operational resilience that require firms to define tolerances for disruption and identify critical activities across core business lines.

With slightly ironic timing, the day before the AWS outage saw a joint publication from the Bank of England, Prudential Regulation Authority (PRA), and Financial Conduct Authority (FCA) that highlighted effective practices financial firms have been observed using to strengthen cyber response and recovery. While targeted at cyber-attack risk rather than operational outages, implementing these insights would give firms a solid baseline of cyber resiliency that would also help insulate them against other kinds of disruption.

  • Firms should ensure they are “testing cyber disruption scenarios that are appropriately severe.” While wholesale outages like the AWS scenario that impact systems at a foundational level are considered “worst case,” firms that prepare for scenarios with high impact will find themselves more resilient and prepared.
  • Stakeholder communication strategies must be in place, resilient, and fit for purpose during severe disruption. Crisis communication plans must be pre-defined, transparent, and timely, and take into account all customers, broader stakeholders, and regulators.
  • A clear plan for restoring critical data from immutable backups is essential, and this must include assessing whether a firm can rebuild critical applications and core infrastructure that support business services or fail over to a separate environment. Firms must recognise that restoring significant volumes of data takes considerable time, and should prioritize making sure important business services can be restored and recovered quickly.
  • Critical third parties must ensure their resilience capabilities are equivalent to the ones firms expect from their own infrastructure, and firms must consider third parties as part of a broader incident response framework that includes clearly defined roles and responsibilities.
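The points above on severe-scenario testing and failing over to a separate environment can be illustrated with a small Python sketch. It is not drawn from the regulators' publication and is not a production pattern. Every name in it (`call_with_failover`, `primary_region`, `secondary_environment`, `AllEnvironmentsFailed`) is hypothetical, and the simulated outage stands in for whatever fault a firm chooses to rehearse.

```python
from typing import Callable, List, Sequence, Tuple


class AllEnvironmentsFailed(Exception):
    """Raised when no environment could serve the request."""


def call_with_failover(
    environments: Sequence[Tuple[str, Callable[[], str]]],
) -> Tuple[str, str, List[str]]:
    """Try each (name, operation) pair in priority order.

    Returns (name_of_environment_used, result, names_that_failed).
    """
    failed: List[str] = []
    for name, operation in environments:
        try:
            return name, operation(), failed
        except Exception:
            # Broad catch is deliberate in this sketch: any fault in one
            # environment should trigger a move to the next.
            failed.append(name)
    raise AllEnvironmentsFailed(", ".join(failed))


# Hypothetical stand-ins for a primary region and a separate environment.
def primary_region() -> str:
    raise ConnectionError("simulated regional outage")


def secondary_environment() -> str:
    return "balance: 100.00"


used, result, failed = call_with_failover(
    [("primary", primary_region), ("secondary", secondary_environment)]
)
print(used, result, failed)
```

Recording which environments failed, rather than silently moving on, gives incident responders and communications teams the facts they need during a real event. Running a drill like this on a schedule, with the primary deliberately failing, is one way to rehearse an appropriately severe scenario.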

The joint document reminds us that “cyber threat continues to evolve, and third-party dependencies are increasing,” and that firms, regulators, and third-party suppliers must work in partnership to ensure resilience remains high. Cautious firms will be reassessing their level of dependency on single service suppliers and taking steps to ensure their eggs aren’t all kept in one basket – because, should that basket collapse, it would result in quite the mess.


While businesses and organizations increasingly rely on large-scale public cloud providers to underpin many of their services, doing so presents considerable risk. From cyber-attacks to service outages, issues with cloud services can lead to lost revenue, reputational damage, and breaches of regulatory resilience requirements. Seeking out solutions hosted in purpose-built private cloud environments that maximize data security and availability can significantly increase resilience.
