Lessons from the October 2025 AWS DNS Outage

Figure: The Domino Effect: AWS DNS Outage. A DNS resolution failure triggers a cascade of falling dominoes representing AWS services (DynamoDB, Lambda, ECS, API Gateway), ending at your business.

Early on October 20, 2025, Amazon Web Services (AWS) experienced a significant outage affecting its US-EAST-1 region in northern Virginia. The root cause was DNS resolution failures for DynamoDB's API endpoints, which cascaded across AWS's interconnected services. The incident disrupted major platforms including Signal, Snapchat, Fortnite, Reddit, Coinbase, Ring, Amazon's own Alexa and Prime Video services, and banking applications for institutions like Bank of Scotland, Halifax, and Lloyds.

This outage demonstrates a fundamental truth about modern cloud infrastructure: DNS failures create disproportionate impact. When DNS resolution fails, even perfectly healthy servers become unreachable, effectively taking services offline.

Timeline and Technical Details

AWS first reported elevated error rates across multiple services at approximately 3:11 AM ET. The company identified the root cause as DNS resolution issues affecting DynamoDB API endpoints in the US-EAST-1 region. Because DynamoDB is a foundational service for many AWS offerings, the inability to resolve its endpoints caused cascading failures across more than 70 AWS services.

By 6:35 AM ET, AWS reported that the DNS issue had been "fully mitigated" and service operations were returning to normal. However, some services experienced lingering effects as backlogs cleared and systems recovered. The total disruption lasted approximately 3-4 hours for most affected services.

The incident highlights how DNS is a critical dependency of distributed systems. When DNS resolution fails, services can't locate the API endpoints they depend on. This creates the same observable failure as if those endpoints were completely offline, even though the underlying infrastructure might be functioning normally.

Why DNS Failures Have Outsized Impact

DNS resolution is typically one of the first steps in any network communication. When an application needs to communicate with a service like DynamoDB, it must first resolve the service's hostname to an IP address. If this resolution fails or returns incorrect information, the entire communication chain breaks down.
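As a minimal sketch of that first step, the Python snippet below resolves a hostname with the standard library's `socket.getaddrinfo` and reports what happens when resolution fails. The helper names `resolve` and `describe_reachability` are illustrative, not part of any library, and the lookup goes through whatever resolver the operating system is configured to use.

```python
import socket


def resolve(hostname, port=443):
    """Return the sorted IP addresses for hostname using the system resolver.

    Raises socket.gaierror if the name cannot be resolved.
    """
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


def describe_reachability(hostname, resolver=resolve):
    """Report whether a client could even begin to connect to hostname."""
    try:
        addresses = resolver(hostname)
    except socket.gaierror as exc:
        # No address means no connection attempt: from the client's point
        # of view the service is offline, whatever its real state.
        return f"{hostname}: unreachable, DNS resolution failed ({exc})"
    return f"{hostname}: resolved to {', '.join(addresses)}"
```

Calling `describe_reachability("example.com")` performs a real lookup. The `resolver` parameter exists so the failure path can be exercised without depending on a live network.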

In AWS's case, internal DNS resolution problems meant that services couldn't locate DynamoDB endpoints. Because many AWS services use DynamoDB for state management, session storage, and data persistence, this single DNS issue propagated across a large portion of AWS's infrastructure.
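The propagation described above can be sketched as a walk over a dependency graph. The graph in the usage note is purely illustrative: the service names appear in this article, but the edges are assumptions made for the example, not a description of AWS's actual internal architecture. `affected_by` is a hypothetical helper.

```python
from collections import deque


def affected_by(failed, depends_on):
    """Return every service that directly or transitively depends on `failed`.

    depends_on maps each service name to the names it depends on.
    """
    # Invert the graph so we can walk from a dependency to its dependents.
    dependents = {}
    for service, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(service)

    # Breadth-first walk outward from the failed component.
    seen = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for service in dependents.get(current, ()):
            if service not in seen:
                seen.add(service)
                queue.append(service)

    seen.discard(failed)  # only relevant if the graph contains a cycle
    return seen
```

For example, with the assumed graph `{"dynamodb": ["dynamodb-dns"], "lambda": ["dynamodb"], "api-gateway": ["lambda"], "static-site": []}`, a failure of `"dynamodb-dns"` reaches `dynamodb`, `lambda`, and `api-gateway`, while `static-site` is untouched.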

This cascading effect demonstrates why DNS monitoring deserves special attention in your infrastructure reliability strategy. Unlike many other failure modes that might affect individual services or components, DNS failures can simultaneously impact everything that depends on the affected domain.

The Case for DNS Monitoring

While AWS's internal DNS infrastructure isn't directly accessible to external monitoring, organizations running on AWS can monitor their own DNS configurations to catch issues before they affect users. DNS Check provides monitoring for several scenarios, which can help prevent an outage or limit its impact:

Name server availability monitoring: Track whether your authoritative name servers are responding to queries. If a name server becomes unreachable, DNS Check detects this on its next check and sends an alert, giving you time to investigate before users are affected.
Resolution correctness: Verify that DNS queries return the expected values. Incorrect DNS responses can route traffic to the wrong endpoints or create partial outages that are difficult to diagnose.
Geographic consistency: DNS Check queries from multiple global locations, helping you detect geolocation-based routing issues or inconsistent responses from different name servers.
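To make the resolution correctness check concrete, here is a small Python sketch that compares the addresses a hostname resolves to against an expected set. It is not how DNS Check works internally. The `resolve_ipv4` and `check_record` names are hypothetical, and the lookup goes through the local system resolver from a single location rather than querying authoritative name servers directly.

```python
import socket


def resolve_ipv4(hostname):
    """Resolve hostname to a set of IPv4 addresses via the system resolver."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    return {info[4][0] for info in infos}


def check_record(hostname, expected, resolver=resolve_ipv4):
    """Compare resolved addresses with the expected ones and report the result."""
    expected = set(expected)
    try:
        actual = set(resolver(hostname))
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return {"hostname": hostname, "status": "unresolvable", "detail": str(exc)}

    if actual != expected:
        return {
            "hostname": hostname,
            "status": "mismatch",
            "missing": sorted(expected - actual),     # expected but not returned
            "unexpected": sorted(actual - expected),  # returned but not expected
        }
    return {"hostname": hostname, "status": "ok"}
```

A scheduler could call `check_record("www.example.com", {"192.0.2.10"})` at a fixed interval and alert on any status other than `"ok"`. Covering name server availability and geographic consistency would additionally require querying each authoritative server directly and running the check from several locations, which this sketch does not attempt.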

For organizations depending on cloud infrastructure, DNS monitoring provides an early warning system. While you can't control AWS's internal DNS infrastructure, you can monitor your own DNS records and receive immediate notification when issues occur. This visibility helps you respond quickly, whether that means communicating with customers, activating failover systems, or contacting your service provider.

Building DNS Resilience

The AWS outage reinforces several architectural principles for DNS resilience:

Use multiple authoritative name servers: Most domains already list more than one authoritative name server, which provides redundancy if one of them fails. Confirm that yours does, and that every listed name server actually answers queries.
Monitor DNS responses continuously: Automated monitoring detects DNS issues faster than manual checks or user reports. DNS Check monitors your records at regular intervals and alerts you immediately when problems occur, helping you maintain awareness of your DNS infrastructure's health.
Understand your DNS dependencies: Map out which systems depend on DNS resolution for critical services. This understanding helps you assess the potential impact of DNS failures and prioritize monitoring for your most important records.
Implement appropriate timeout and retry logic: Applications should handle DNS resolution failures gracefully. Configure reasonable timeout values and implement retry logic that accounts for temporary DNS issues without creating excessive load on DNS infrastructure.
Consider multi-region and multi-cloud strategies: While complex to implement, distributing your infrastructure across multiple regions or cloud providers can reduce the impact of regional outages. This typically requires careful DNS configuration to route traffic appropriately and failover mechanisms to handle regional failures.
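The timeout and retry principle above can be sketched in Python as a bounded retry loop with exponential backoff and jitter. This is one possible approach rather than a prescribed implementation: `resolve_with_retry` and its default values are assumptions chosen for illustration. Note that Python's `socket.gethostbyname` takes no timeout argument, so how long a single lookup may take is governed by the operating system's resolver configuration. This sketch only bounds the number of attempts and the wait between them.

```python
import random
import socket
import time


def resolve_with_retry(hostname, attempts=3, base_delay=0.2, max_delay=2.0,
                       resolver=socket.gethostbyname, sleep=time.sleep):
    """Resolve hostname, retrying a limited number of times on failure.

    Waits between attempts use exponential backoff with full jitter so that
    many clients recovering at once do not retry in lockstep.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")

    last_error = None
    for attempt in range(attempts):
        try:
            return resolver(hostname)
        except socket.gaierror as exc:
            last_error = exc
            if attempt == attempts - 1:
                break  # out of attempts: do not sleep, just give up
            # Upper bound doubles each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * (2 ** attempt))
            sleep(random.uniform(0, ceiling))
    raise last_error
```

Capping both the attempt count and the delay keeps a failing lookup from blocking the caller indefinitely, and the random jitter spreads retries out so that recovering clients add less synchronized load to the DNS infrastructure.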

Looking Forward

Large-scale outages like this AWS incident serve as valuable reminders that no infrastructure is immune to failures. DNS's role as a foundational Internet protocol means that DNS issues often create widespread impact. By implementing proactive DNS monitoring, organizations gain visibility into this critical infrastructure layer and can respond quickly when problems occur.

DNS Check monitors your DNS records and sends notifications when problems are detected, helping you maintain reliable DNS infrastructure. Whether you're running a small application or managing DNS for a large organization, monitoring provides the visibility needed to catch issues early and respond effectively.

Ready to add DNS monitoring to your infrastructure? Sign up for a free DNS Check account and start monitoring your critical DNS records today.