Cloudflare experienced a widespread service failure on Tuesday after an internal configuration issue caused major disruptions across the internet, affecting platforms such as X, ChatGPT, Canva, and several other high-traffic websites. The outage persisted for nearly five hours, and users were repeatedly met with HTTP 500 internal server errors. Cloudflare’s co-founder and CEO, Matthew Prince, later explained that the issue originated within the company’s systems and was not related to any form of cyberattack or external threat.
According to Prince, this incident marked the company’s most significant outage since 2019. The disruption was caused by a permissions modification inside one of Cloudflare’s database environments. This change resulted in the creation of an unusually large feature file used by the Bot Management system. The oversized file, which exceeded expected limits, was pushed across Cloudflare’s network and triggered failures in the proxy services attempting to process it.
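The failure mode described above, in which a file larger than expected breaks the service that consumes it, can be illustrated with a minimal sketch. This is not Cloudflare’s code: the `FeatureStore` class, the limit of 200 entries, and the one-feature-per-line file format are all assumptions made for illustration. The sketch shows one defensive pattern, which is to reject an oversized file and keep serving the last known good version.

```python
from dataclasses import dataclass, field

# Hypothetical limit, chosen only for this example.
MAX_FEATURES = 200


class FeatureFileTooLarge(ValueError):
    """Raised when a feature file has more entries than the allowed limit."""


def parse_feature_file(lines, max_features=MAX_FEATURES):
    """Parse a feature file (assumed format: one feature name per line)."""
    features = [line.strip() for line in lines if line.strip()]
    if len(features) > max_features:
        raise FeatureFileTooLarge(
            f"{len(features)} features exceeds limit of {max_features}"
        )
    return features


@dataclass
class FeatureStore:
    """Holds the active feature list and falls back to it on a bad update."""

    active: list = field(default_factory=list)
    rejected: int = 0

    def update(self, lines, max_features=MAX_FEATURES):
        """Apply a new file if valid; otherwise keep the last good version."""
        try:
            candidate = parse_feature_file(lines, max_features)
        except FeatureFileTooLarge:
            self.rejected += 1  # count the rejection instead of crashing
            return False
        self.active = candidate
        return True
```

With a limit of 3, for example, `store.update(["a", "b"], max_features=3)` is accepted, while a later four-entry file is rejected and `store.active` still holds the two-entry version.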
The problematic file was traced back to a recurring query operating on a ClickHouse database cluster. Because certain nodes in the cluster had received software updates while others had not, the system periodically generated flawed versions of the file. This inconsistency caused the network to cycle between functioning and malfunctioning intervals, making troubleshooting far more difficult for the engineering team.
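The intermittent pattern can be pictured with a small simulation. It is a simplified sketch under stated assumptions, not a model of Cloudflare’s actual scheduler: it assumes each run of the recurring query lands on one node in round-robin order, and that a run produces a flawed file exactly when it lands on an updated node. The function names are hypothetical.

```python
def simulate_runs(nodes_updated, runs):
    """Return "ok" or "bad" for each run of the recurring query.

    nodes_updated: list of booleans, True if that node has the update.
    Assumption: runs are assigned to nodes in round-robin order.
    """
    results = []
    for i in range(runs):
        updated = nodes_updated[i % len(nodes_updated)]
        results.append("bad" if updated else "ok")
    return results


def count_flips(results):
    """Count how often the output switches between "ok" and "bad"."""
    return sum(1 for prev, cur in zip(results, results[1:]) if prev != cur)
```

With one updated node out of three, `simulate_runs([False, True, False], 6)` alternates between good and bad output, which is the kind of on-and-off behaviour that can make a fault look like an external attack rather than a deterministic bug.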
Initially, engineers suspected a massive Distributed Denial of Service (DDoS) attack because of the sudden spike in failures. However, after identifying the faulty file, the team stopped its distribution, restored a functioning version, and restarted essential proxy components. Services began stabilising around 14:30 UTC, and Cloudflare confirmed complete recovery by 17:06 UTC.
Multiple Cloudflare tools and services faced interruptions during the event. Core CDN and security systems showed elevated HTTP 5xx error rates. The Turnstile verification service malfunctioned, preventing many users from logging into the Cloudflare Dashboard. Workers KV registered high error rates due to gateway failures. Email security features also experienced reduced accuracy because they temporarily lost access to critical IP reputation data.
Prince acknowledged the severity of the incident and reaffirmed Cloudflare’s commitment to building more resilient systems, stating that failures like this drive the company to strengthen its infrastructure and prevent similar events in the future.

