To cripple AI, hackers are turning data against itself
Data has powered the artificial intelligence revolution. Now security experts are uncovering worrying ways in which AIs can be hacked to go rogue
A self-driving car blows past a stop sign because a carefully crafted sticker bamboozled its computer vision. An eyeglass frame confuse facial recognition tech. The hacking of artificial intelligence is an emerging security crisis.
Pre-empting criminals attempting to hijack artificial intelligence by tampering with datasets or the physical environment, researchers have turned to adversarial machine learning. This is where data has been tweaked to trick a neural network and fool systems into seeing something that isn't there, ignoring what is, or misclassifying objects entirely.
Add an invisible (to humans) layer of data noise onto a photo of a school bus, as Google and New York University researchers did, and a neural network will report back that it's almost perfectly sure that's an ostrich. It's not only images: researchers have tucked hidden voice commands into broadcasts that can control smartphones without our puny human ears being any the wiser.
While such work is now described as an attack, adversarial examples were first seen as an almost philosophical blind spot in neural network design: we had assumed machines see in the same way as we do, that they identified an object using similar criteria to us. The idea was first described in 2014 by Google researchers in a paper on "intriguing properties of neural networks" that described how adding a "perturbation" to an image meant the network saw it incorrectly — which they dubbed "adversarial examples". Small distortions, they revealed, could fool a neural network into misreading a number or misclassifying that school bus. The work raised questions about the "intrinsic blind spots" of neural networks and the "non intuitive characteristics" in how they learn. In other words, we don't really understand how neural networks operate.
"Adversarial examples are just illustrating that we still just have very limited understanding of how deep learning works and their limitations," says Dawn Song, professor of computer science at University of California, Berkeley. Song was one of several researchers across four universities who developed the stop-sign stickers to confuse driverless cars.
"There is a whole spectrum [of attacks] depending on which phase of the machine-learning model generation pipeline the attacker is sitting at," says Earlence Fernandes, a computer security researcher at the University of Washington, who worked on the stop sign research. A training time attack, for example, occurs when the machine-learning model is being built, with malicious data being used to train the system, says Fernandes. "In a face detection algorithm, the attacker could poison the model such that it recognises the attacker’s face as an authorised person," he says.
An inference time attack, on the other hand, is showing specially crafted inputs to the model using a range of algorithms — the Fast Gradient Sign Method or the Carlini and Wagner attack are two popular methods — that subtly alter images to confuse neural networks.
As AI is permeates every facet of our lives — for driving cars, for analysing CCTV systems, for identity via facial recognition — attacks on such systems become all the more likely, and dangerous. Hackers modifying roadside furniture could cause car crashes and injuries. Subtle changes to the data machine learning systems are taught from could also lead to biases being actively added to the decisions AI systems make.
But we shouldn't be worried. Yet. "As far as we know, this type of attack is not being carried out in the real world by malicious parties right now," says Anish Athalye, a researcher at MIT. "But given all the research in this area, it seems that many machine learning systems are very fragile, and I wouldn't be surprised if real-world systems are vulnerable to this kind of attack."
Athalye's own research aimed to make adversarial attacks more robust. Some attacks, classed as "standard", only work from a specific viewpoint, while others work no matter what angle the neural network looks at the object or image. "Standard adversarial examples are created by slightly tweaking the pixels in an image to shift the classification toward some target class — making a picture of a cat be classified as a guacamole," he says. "Repeating this process again and again, making tiny changes, it turns out that it’s possible to make an image that looks like one thing to a person but confuses a machine into thinking it’s something else entirely." Research suggests that standard adversarial attacks were "fragile", he says, and not likely to hold up in the real world.
And so researchers such as Athalye and his colleagues at MIT and LabSix built better examples, optimising the attack image so it works regardless of angle or distance. "We also extended this to 3D objects, so you can have a physical object that looks like a turtle, for example, to a human, but looks like something completely different to a machine, no matter how it’s perceived," he says. That includes his 3D-printed toy turtle, that looks like a rifle to the ImageNet classifier.
An attack is of little use if it only works at a precise angle, or if the perturbation can be easily spotted by humans. Consider self-driving cars: they see the world via computer vision that relies on neural networks to identify objects. Any adversarial tricks would have to work at every angle a car would approach from, a distance and close up, and also not be noticed by human drivers — no one will be able to read a sign that's simply painted over. Researchers, including Fernandes and Song, managed that with subtle paint markings that didn't obscure the signs and with stickers that look like graffiti, but cause neural networks to interpret "stop" as a speed limit instead.
"At a high level, this kind of attack works by getting access to the target deep-learning model, and then running an algorithm to compute what edits need to be made to a physical object, so that it remains visually similar to the original type to a human, but appears as something else altogether to a machine learning model," says Fernandes. "In this case, our algorithm outputs the edits that need to be added. In our case, they are stickers, so we print them on paper, and simply stick them onto a physical stop sign."
That's no reason to panic. Simply slapping these stickers on a stop sign won't crash a self-driving car. Fernandes explains that self-driving cars use multiple sensors and algorithms and don't make decisions on any single machine-learning model. "So, although our work can fool a single machine-learning model, it does not imply that that fooling is enough to cause physical harm," he says.
Building adversarial examples is no easy task, often requiring access to technical details of a neural network, such as the model architecture, known as "white box" access. That said, robust attacks have been described that don't require detailed network information; those black-box attacks could prove more useful for outsiders to attack a system, as they're transferable across different neural networks.
Work is needed now to keep machine-learning from being rendered useless through its inherent weakness. Though there have been plenty of proposed solutions, there's so far no clear defence. "Defences that detect adversarial examples and defences that eliminate the existence of adversarial examples are an active area [of research], with new defences being proposed and those defences getting broken at a very fast pace," says Kevin Eykholt, a researcher at the University of Michigan. "When designing the machine learning systems, it is important to be aware of and possibly mitigate the specific risks of adversarial attacks, rather than blindly design the system and worry about repercussion if they happen," he adds.
One idea that shows promise, says Athalye, is efforts to train neural networks to spot adversarial images by including them in the training data. "This way, the network 'learns' to be somewhat robust to adversarial examples," he says.
That such flaws have been found at the core of machine learning isn't a surprise, says Fernandes, as systems usually aren't well tested until they become more widespread. "As machine learning has become more pervasive, it is only natural that security researchers started looking at it from an adversarial perspective, and found something that can be exploited," he says.
It's not only a technical flaw, but a philosophical assumption. First, machine-learning developers assume that training data and testing data would be similar, when attackers are free to manipulate data to their advantage. And second, we assumed neural networks think like us, when they really don't; the elements a neural network uses to identify a toy turtle is different than what we look for, and that gap is where these attacks live. "Neural nets are extremely crude approximations of the human brain," Fernandes says. "Trying to see them as operating in ways similar to us might not be the best way of thinking about them."
- Information Security Manager- Global Sporting Brand. UK. £100,000
REFCH8265 Identifier Project Information Security Manager- Global Sporting Brand. UK. £100,000 A unique and exclusive opportunity to DCL Search to provide leadership and guidance Information and IT Security practices to one of the most recognised sporting brands in the world. You will be the envy of your colleagues, friends and peers as you take the lead in developing and implementing a security strategy. You must have a blend of knowledge across information security and technical security and be able to build internal and external stakeholder relationships. To coin a well known phrase, you should be a player manager. You don’t need to be currently hands on configuring firewalls, monitoring SIEM alerts, but maybe you have in the past. Ideally you will have come from a technical background as you will be closing be working with technical teams. Skills should include, but not be limited to: Managing / developing to Incident response plans. Information Security Risk Management / compliance. Security awareness Driving remediation plans to address vulnerabilities etc. Hybrid working. Up to £100,000 + benefits.
- Lead Information and Cyber Security Specialist, Financial Services. Exclusive to DCL Search
Consultative approach with experience engaging with internal stakeholders providing advice and guidance across information security policies and standards into projects and programmes. Risk identification / Assessment / Management across people and process. ISO27001. Open mindedness to take on projects and programmes that will involve advising, scoping, refining, improving technical security control relating to best practice. Preferred experience; PCI DSS ISA or consultative experience within security Payment card industry. Information Security / technical security controls within Financial Services. Risk Assessment / management across technical controls. Technical Security background. Experience within secure by design and the technical security controls relating to projects / programmes. iSO27001 Lead Implementer / Auditor. CISA, CISM, CISSP. 2 days a fortnight in London- or more if you want.. Hybrid reworking.
- Cyber Security Associate, Financial Services. Exclusive to DCL Search
Exclusive Cyber Security Associate needed within a forward thinking financial services business head quartered in London. DCL Search have been engaged on an Identifier Project to attract the very best cyber talent to this business. Influence the cyber security capability and direction within the business. Learn new skills working within a collaborative team. Grow as a security professional. ROLE Triaging and troubleshooting security alerts at a level 1 / level 2 capacity. Reviewing security change management requests. Managing and use of security tooling such as; Endpoint management Vulnerability management Patch management CASB Experience with the following tools is desirable. ZOHO Desktop Central (Endpoint Management) Splunk (SIEM) Qualys CASB (Microsoft) Microsoft Azure Varonis DatAdvantage ADAudit Plus Sonicwall, Paloalto, Dark Trace, Cloudflare, Cisco Umbrella, Microsoft defender.
- Senior Cyber Security Engineer, Financial Services. Exclusive to DCL Search
Exclusive Senior Cyber Security Engineer needed within a forward thinking financial services business head quartered in London. DCL Search have been engaged on an Identifier Project to attract the very best cyber talent to this business. Influence the cyber security capability and direction within the business. Learn new skills working within a collaborative team. Grow as a security professional. ROLE Day to day operations, management and scalability of existing cyber security systems Managing of and maturing security tooling such as; SIEM Endpoint Management Firewall Patch Management CASB Vulnerability management. Triaging and troubleshooting security alerts. Improve tooling, reducing false positives. Reviewing, approving, escalating security change management requests. Implementing new cyber security systems. Ideal technical experience · Vulnerability Management: Qualys · Endpoint Management: ZOHO Desktop Central · Forcepoint: CASB, DLP, webs security, email security. · SIEM (Splunk) · Firewalls: Sonicwall, Palo Alto · Endpoint Microsoft Defender · Appreciation of ISO27001, GDPR, PCI, etc 2 days a fortnight in London- or more if you want.. Hybrid reworking.