Facebook, Instagram, WhatsApp suffer global outage
Facebook has suffered one of the most sustained outages in its history. The cause of the 14-hour problem remains unknown, and issues continue with Facebook, Instagram, WhatsApp and Messenger.
The trouble began around 9:00 PDT on 13 March and continues to affect some services today (14 March). The cause remains a mystery but appears to lie in the application software. A tweet from NBC journalist Raj Mathai blaming "database overload" has not been corroborated, and earlier reports pointing to a leak of BGP Internet routing data (similar to the cause of last year's Google Cloud outage) have been ruled out.
Downer
Reports from DownDetector suggested that the problem was global, and peaked between 9:00 and 15:00 PDT for all three services, before tailing off, though there are still residual reports of problems with all three applications.
Facebook announced two years ago that it was moving data from its WhatsApp acquisition from IBM's cloud to its own data centers. The fact that WhatsApp failed in step with the mother ship could be evidence that the project was successfully completed.
"We're aware that some people are currently having trouble accessing the Facebook family of apps," Facebook said in a tweet. "We're working to resolve the issue as soon as possible." The company added that "the issue is not related to a DDoS attack."
Facebook has faced disruption before, but not on the same scale. In 2015, it went down twice in one week, but each outage lasted less than one hour, and the service had only 1.5 billion monthly users at the time, compared with today's 2.3 billion. The previous year, in 2014, a botched software update took it out for 2.5 hours, and in 2010, a database problem disabled it. It has been down for a longer period, but that was in 2008, when the site had fewer than 150 million users.
Respected network firm Netscout has scotched earlier reports of a BGP error. Last night, various outlets reported that Netscout had found evidence of a leak of BGP routing data, but Roland Dobbins, a Netscout principal engineer, told Ars Technica's Dan Goodin that this was an internal "miscommunication" which resulted in an erroneous email being sent to journalists:
Dan Goodin @dangoodin001
Roland Dobbins, principal engineer at Netscout's Assert team, says he has no data whatsoever to support that claim that a BGP leak is the cause of today's Facebook or Instagram outages. "There was an internal miscomm here," he says of the email PR people sent to reporters.
The outages have also affected Facebook's ad-buying system, and several brand marketers have tweeted about the issue. Facebook said that it is investigating the overall impact of the outage "including the possibility of refunds for advertisers."
Sales estimates for 2019 put Facebook's daily ad revenue at $250 million (£189m), so any downtime for ad sales will be costly.
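As a rough illustration of what "costly" could mean, the sketch below pro-rates the $250 million daily figure over the 14-hour outage. It assumes ad revenue is spread evenly across the day and that every ad sale stopped for the full period. Neither assumption comes from Facebook or from the estimates quoted here, so the result is best read as an upper bound. The function name and the `affected_fraction` parameter are hypothetical.

```python
DAILY_AD_REVENUE_USD = 250_000_000  # 2019 estimate quoted in the article
OUTAGE_HOURS = 14                   # outage duration quoted in the article


def revenue_at_risk(daily_revenue, outage_hours, affected_fraction=1.0):
    """Pro-rate daily revenue over the outage window.

    affected_fraction is a hypothetical knob for the share of ad sales
    actually disrupted; 1.0 assumes all of them stopped (an assumption).
    """
    return daily_revenue * outage_hours / 24 * affected_fraction


if __name__ == "__main__":
    total = revenue_at_risk(DAILY_AD_REVENUE_USD, OUTAGE_HOURS)
    print(f"Upper-bound revenue at risk: ${total:,.0f}")
```

Under those assumptions the figure comes to roughly $146 million. The true impact depends on how much of the ad system was actually unavailable and on any refunds Facebook decides to issue.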
Network monitoring company ThousandEyes said: “The cause would appear to be internal rather than a network or Internet delivery issue - for example we saw '500 internal server errors' from Facebook. Given the sheer scale and continuous changes that these web scale providers are constantly making to their applications and infrastructure, sometimes things break as a result of these changes, even in the most capable hands.
"When investigating Facebook’s issues today, we’re not seeing any BGP changes that are affecting connectivity, packet loss or latency. Since Facebook uses its own backbone network, it’s not clear/we don’t have insight as to how an external transit route issue would cause a disruption within the internal Facebook network.”
In an effort to minimize the chances of an outage, Facebook has a team working on 'Project Storm,' which stress-tests data centers with various tests and drills, including turning off a data center entirely.
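The reasoning ThousandEyes describes above, that "500 internal server errors" point to an internal fault rather than a delivery problem, can be sketched in a few lines. This is a minimal illustration, not ThousandEyes' method: the function names, the category labels and the sample data are all hypothetical. The idea is that an HTTP 5xx response proves the request reached a server that then failed, whereas a probe that gets no response at all is consistent with a network or routing problem.

```python
from collections import Counter
from typing import Optional


def classify_probe(status: Optional[int], transport_error: Optional[str] = None) -> str:
    """Classify one HTTP probe result.

    status: HTTP status code, or None if no response arrived.
    transport_error: hypothetical label such as "timeout" or "dns" for a
    request that failed before any HTTP response was received.
    """
    if transport_error is not None or status is None:
        # No HTTP response at all: consistent with a network delivery issue.
        return "network"
    if 500 <= status <= 599:
        # A 5xx response means the server was reached but failed internally.
        return "internal"
    return "reachable"


def summarize(probes):
    """Count classifications across (status, transport_error) pairs."""
    return Counter(classify_probe(status, err) for status, err in probes)


# Made-up sample data, not measurements from the outage.
sample = [(500, None), (500, None), (503, None), (200, None), (None, "timeout")]

if __name__ == "__main__":
    print(summarize(sample))
```

A tally dominated by "internal" results would match the pattern ThousandEyes reported. A real monitoring service would also look at routing data, packet loss and latency before drawing a conclusion.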
Source: datacenterdynamics