Microsoft Azure suffered a global outage Thursday afternoon after a Domain Name System (DNS) update mangled domain records. The outage lasted nearly three hours and took out a host of Microsoft cloud services, along with some third-party apps and sites running on Microsoft’s cloud.
Between 19:43 and 22:35 UTC, the global outage impacted a number of Microsoft cloud services, causing connection problems for core services like Azure, multiple services under the Microsoft 365 umbrella, Dynamics, and DevOps.
The incident had a knock-on effect for Azure compute, storage, App Service, Azure AD identity services, and SQL Database.
“Engineers are investigating DNS resolution issues affecting network connectivity. Connectivity issues are resulting in downstream impact to Compute, Storage, and Database services, and some customers may be unable to file support requests.”
“More information will be provided as it becomes available. Some customers may start to see recovery.”
After investigating the outage, Microsoft confirmed that “users may be unable to access Microsoft 365 services or features”, adding that it had “identified and corrected a DNS configuration issue that prevented users from accessing Microsoft 365 services and features”.
“We’ve observed an increase in successful connections and our telemetry indicates that all services are recovering. We’re continuing to monitor the environment to validate that service has been restored.”
The three-hour outage has since come to an end, with Microsoft confirming that its engineers have mitigated the issue and that most services have been recovered.
“Engineers identified the underlying root cause as a nameserver delegation change affecting DNS resolution and resulting in a downstream impact to Compute, Storage, App Service, AAD, and SQL Database services,” it said.
“During the migration of a legacy DNS system to Azure DNS, some domains for Microsoft services were incorrectly updated. No customer DNS records were impacted during this incident, and the availability of Azure DNS remained at 100% throughout the incident. The problem impacted only records for Microsoft services.
“To mitigate, engineers corrected the nameserver delegation issue. Applications and services that accessed the incorrectly configured domains may have cached the incorrect information, leading to a longer restoration time until their cached information expired.”
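The caching effect Microsoft describes can be illustrated with a minimal sketch of a time-to-live (TTL) cache. Everything here is hypothetical and for illustration only: the `TtlDnsCache` class, the `service.example` name, the documentation-range addresses, and the 300-second TTL are assumptions, not details from Microsoft's report, and the clock is simulated rather than real.

```python
from dataclasses import dataclass


@dataclass
class CachedRecord:
    address: str
    expires_at: float  # simulated clock time, in seconds


class TtlDnsCache:
    """Toy resolver cache: serves a cached answer until its TTL expires."""

    def __init__(self, authoritative):
        # authoritative maps name -> (address, ttl_seconds); hypothetical data
        self.authoritative = authoritative
        self.cache = {}

    def resolve(self, name, now):
        entry = self.cache.get(name)
        if entry is not None and now < entry.expires_at:
            return entry.address  # still fresh, so the upstream fix is not seen
        address, ttl = self.authoritative[name]
        self.cache[name] = CachedRecord(address, now + ttl)
        return address


# t=0: the bad record is published and a client caches it.
authoritative = {"service.example": ("203.0.113.9", 300)}
cache = TtlDnsCache(authoritative)
first = cache.resolve("service.example", now=0)

# t=60: engineers correct the record at the source.
authoritative["service.example"] = ("192.0.2.10", 300)

# t=120: the client still gets the stale answer from its cache.
stale = cache.resolve("service.example", now=120)

# t=300: the cached entry has expired, so the corrected record is fetched.
recovered = cache.resolve("service.example", now=300)
```

In this sketch the client keeps receiving the incorrect address for four minutes after the fix, which is the kind of lag that can stretch recovery beyond the moment the underlying records are corrected.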