To cripple AI, hackers are turning data against itself
Data has powered the artificial intelligence revolution. Now security experts are uncovering worrying ways in which AIs can be hacked to go rogue
A self-driving car blows past a stop sign because a carefully crafted sticker bamboozled its computer vision. An eyeglass frame confuses facial recognition tech. The hacking of artificial intelligence is an emerging security crisis.
Pre-empting criminals attempting to hijack artificial intelligence by tampering with datasets or the physical environment, researchers have turned to adversarial machine learning: the study of how data can be tweaked to trick a neural network, fooling systems into seeing something that isn't there, ignoring what is, or misclassifying objects entirely.
Add an invisible (to humans) layer of data noise onto a photo of a school bus, as Google and New York University researchers did, and a neural network will report back that it's almost perfectly sure that's an ostrich. It's not only images: researchers have tucked hidden voice commands into broadcasts that can control smartphones without our puny human ears being any the wiser.
While such work is now described as an attack, adversarial examples were first seen as an almost philosophical blind spot in neural network design: we had assumed machines see in the same way we do, identifying an object using similar criteria to ours. The idea was first described in 2014 by Google researchers in a paper on "intriguing properties of neural networks", which showed that adding a "perturbation" to an image could make the network see it incorrectly; they dubbed such doctored inputs "adversarial examples". Small distortions, they revealed, could fool a neural network into misreading a number or misclassifying that school bus. The work raised questions about the "intrinsic blind spots" of neural networks and the "non-intuitive characteristics" of how they learn. In other words, we don't really understand how neural networks operate.
"Adversarial examples are just illustrating that we still just have very limited understanding of how deep learning works and their limitations," says Dawn Song, professor of computer science at the University of California, Berkeley. Song was one of several researchers across four universities who developed the stop-sign stickers to confuse driverless cars.
"There is a whole spectrum [of attacks] depending on which phase of the machine-learning model generation pipeline the attacker is sitting at," says Earlence Fernandes, a computer security researcher at the University of Washington, who worked on the stop sign research. A training time attack, for example, occurs when the machine-learning model is being built, with malicious data being used to train the system, says Fernandes. "In a face detection algorithm, the attacker could poison the model such that it recognises the attacker’s face as an authorised person," he says.
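The face-detection poisoning Fernandes describes can be sketched in miniature. The toy below is not his group's method: it uses an invented nearest-centroid "recogniser" over made-up 2-D feature vectors, purely to show how mislabelled training data shifts a model's decision in the attacker's favour.

```python
# Toy training-time (poisoning) attack on a nearest-centroid classifier.
# All feature vectors and labels are invented for illustration.

def centroid(points):
    """Mean of a list of feature vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]

def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def classify(x, centroids):
    """Label whose centroid lies nearest to x."""
    return min(centroids, key=lambda label: dist2(x, centroids[label]))

# Clean training data: toy 2-D "face embeddings" for two classes.
train = {"authorised": [[1.0, 1.0], [1.2, 0.9]],
         "other":      [[-1.0, -1.0], [-0.9, -1.1]]}
attacker = [-0.5, -0.5]  # the attacker's own face embedding

clean = {label: centroid(pts) for label, pts in train.items()}
print(classify(attacker, clean))      # -> 'other'

# Poisoning: slip mislabelled copies of the attacker's features
# into the 'authorised' class before the centroids are computed.
train["authorised"] += [attacker] * 10
poisoned = {label: centroid(pts) for label, pts in train.items()}
print(classify(attacker, poisoned))   # -> 'authorised'
```

The poisoned samples drag the "authorised" centroid toward the attacker's features, so the trained model now waves the attacker through.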
An inference time attack, on the other hand, involves showing specially crafted inputs to a model that has already been trained. Algorithms such as the Fast Gradient Sign Method or the Carlini and Wagner attack subtly alter images to confuse neural networks.
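On a simple model, the Fast Gradient Sign Method amounts to a few lines: nudge every input feature by a small budget eps in the direction that increases the model's loss. The sketch below is a hedged toy on logistic regression, not the image-scale setups used in the research; the weights and inputs are invented.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b):
    """+1 / -1 decision of a logistic-regression model."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

def fgsm(x, y, w, b, eps):
    """Fast Gradient Sign Method: move x by eps along the sign of the
    gradient of the loss with respect to the input."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    # For loss = -log(sigmoid(y*z)), d(loss)/dx = -y * sigmoid(-y*z) * w
    grad = [-y * sigmoid(-y * z) * wi for wi in w]
    return [xi + eps * (1 if g > 0 else -1) for xi, g in zip(x, grad)]

w, b = [2.0, -1.0], 0.0          # invented model weights
x, y = [0.3, 0.4], 1             # an input the model classifies correctly
print(predict(x, w, b))          # -> 1

x_adv = fgsm(x, y, w, b, eps=0.2)
print(predict(x_adv, w, b))      # -> -1: a small nudge flips the prediction
```

The perturbed input stays close to the original, yet the model's answer flips, which is the essence of an inference-time attack.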
As AI permeates every facet of our lives, driving cars, analysing CCTV systems and verifying identities via facial recognition, attacks on such systems become all the more likely, and more dangerous. Hackers modifying roadside furniture could cause car crashes and injuries, while subtle changes to the data machine-learning systems are trained on could actively introduce biases into the decisions AI systems make.
But we shouldn't be worried. Yet. "As far as we know, this type of attack is not being carried out in the real world by malicious parties right now," says Anish Athalye, a researcher at MIT. "But given all the research in this area, it seems that many machine learning systems are very fragile, and I wouldn't be surprised if real-world systems are vulnerable to this kind of attack."
Athalye's own research has aimed to make adversarial attacks more robust. Some attacks, classed as "standard", only work from a specific viewpoint, while others work no matter what angle the neural network views the object or image from. "Standard adversarial examples are created by slightly tweaking the pixels in an image to shift the classification toward some target class — making a picture of a cat be classified as a guacamole," he says. "Repeating this process again and again, making tiny changes, it turns out that it’s possible to make an image that looks like one thing to a person but confuses a machine into thinking it’s something else entirely." His research suggests that such standard adversarial attacks are "fragile", he says, and not likely to hold up in the real world.
And so researchers such as Athalye and his colleagues at MIT and LabSix built better examples, optimising the attack image so it works regardless of angle or distance. "We also extended this to 3D objects, so you can have a physical object that looks like a turtle, for example, to a human, but looks like something completely different to a machine, no matter how it’s perceived," he says. That includes his 3D-printed toy turtle, which looks like a rifle to an ImageNet classifier.
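The core trick behind such robust examples is to optimise one perturbation against many views of the input at once, rather than a single snapshot. The toy below is only a sketch of that idea on an invented logistic-regression model, with simple additive shifts standing in for real changes of viewpoint; it is not the researchers' actual pipeline.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

def robust_attack(x, y, w, b, shifts, eps, steps=20, lr=0.05):
    """Optimise ONE perturbation d against the summed loss over every
    transformation (here: additive shifts), keeping |d_i| <= eps."""
    d = [0.0] * len(x)
    for _ in range(steps):
        g = [0.0] * len(x)
        for s in shifts:
            z = sum(wi * (xi + di + si)
                    for wi, xi, di, si in zip(w, x, d, s)) + b
            p = -y * sigmoid(-y * z)          # d(loss)/dz for this view
            g = [gi + p * wi for gi, wi in zip(g, w)]
        # Gradient-ascent step on the summed loss, clipped to the eps box.
        d = [max(-eps, min(eps, di + lr * (1 if gi > 0 else -1)))
             for di, gi in zip(d, g)]
    return d

w, b = [2.0, -1.0], 0.0                          # invented model
x, y = [0.3, 0.4], 1
shifts = [[-0.3, 0.0], [0.0, 0.0], [0.3, 0.0]]   # stand-ins for viewpoints

# A perturbation tuned for the unshifted view alone fails under a shift...
d_single = [-0.2, 0.2]
print(predict([xi + di + si
               for xi, di, si in zip(x, d_single, shifts[2])], w, b))  # -> 1

# ...while the jointly optimised one fools the model under every shift.
d = robust_attack(x, y, w, b, shifts, eps=0.35)
print([predict([xi + di + si for xi, di, si in zip(x, d, s)], w, b)
       for s in shifts])                          # -> [-1, -1, -1]
```

Averaging the attack over the whole set of transformations is what lets a single physical object, like the turtle, fool the model from any angle.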
An attack is of little use if it only works from a precise angle, or if the perturbation can easily be spotted by humans. Consider self-driving cars: they see the world via computer vision that relies on neural networks to identify objects. Any adversarial trick would have to work from every angle a car might approach, at a distance and close up, and also go unnoticed by human drivers; no one would trust a sign that had obviously been painted over. Researchers, including Fernandes and Song, managed that with subtle paint markings that didn't obscure the signs and with stickers that look like graffiti, but cause neural networks to interpret "stop" as a speed limit instead.
"At a high level, this kind of attack works by getting access to the target deep-learning model, and then running an algorithm to compute what edits need to be made to a physical object, so that it remains visually similar to the original type to a human, but appears as something else altogether to a machine learning model," says Fernandes. "In this case, our algorithm outputs the edits that need to be added. In our case, they are stickers, so we print them on paper, and simply stick them onto a physical stop sign."
That's no reason to panic. Simply slapping these stickers on a stop sign won't crash a self-driving car. Fernandes explains that self-driving cars use multiple sensors and algorithms and don't make decisions on any single machine-learning model. "So, although our work can fool a single machine-learning model, it does not imply that that fooling is enough to cause physical harm," he says.
Building adversarial examples is no easy task, often requiring access to the technical details of a neural network, such as its model architecture, known as "white box" access. That said, robust attacks have been described that don't require detailed network information; such "black box" attacks could prove more useful for outsiders, as they transfer across different neural networks.
Work is needed now to keep machine learning from being rendered useless by this inherent weakness. Though plenty of solutions have been proposed, there's so far no clear defence. "Defences that detect adversarial examples and defences that eliminate the existence of adversarial examples are an active area [of research], with new defences being proposed and those defences getting broken at a very fast pace," says Kevin Eykholt, a researcher at the University of Michigan. "When designing the machine learning systems, it is important to be aware of and possibly mitigate the specific risks of adversarial attacks, rather than blindly design the system and worry about repercussion if they happen," he adds.
One idea that shows promise, says Athalye, is to train neural networks to spot adversarial images by including them in the training data. "This way, the network 'learns' to be somewhat robust to adversarial examples," he says.
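That defence, often called adversarial training, can be sketched on a toy logistic-regression model: attack the trained model, keep the adversarial copies with their true labels, and retrain on the enlarged dataset. All numbers below are invented, and the attack used to generate the examples is a simple gradient-sign step, standing in for the stronger attacks used in practice.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, w, b):
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1

def train(data, epochs=300, lr=0.5):
    """Plain logistic-regression SGD; labels are +1 / -1."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            s = -y * sigmoid(-y * z)            # d(loss)/dz
            w = [wi - lr * s * xi for wi, xi in zip(w, x)]
            b -= lr * s
    return w, b

def attack(x, y, w, b, eps):
    """Move x by eps along the sign of the input gradient of the loss."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    g = [-y * sigmoid(-y * z) * wi for wi in w]
    return [xi + eps * (1 if gi > 0 else -1) for xi, gi in zip(x, g)]

data = [([1.0, 0.2], 1), ([-1.0, -0.2], -1)]
w, b = train(data)

# Adversarial copies of the training points, keeping their TRUE labels.
adv = [(attack(x, y, w, b, eps=1.0), y) for x, y in data]
print([predict(x, w, b) for x, _ in adv])     # -> [-1, 1]: both fooled

# Adversarial training: retrain with the adversarial copies included.
w2, b2 = train(data + adv)
print([predict(x, w2, b2) for x, _ in adv])   # -> [1, -1]: now correct
```

The retrained model has "seen" the attack during training, so those particular adversarial points no longer fool it, though, as the article notes, this only makes a network somewhat robust rather than immune.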
That such flaws have been found at the core of machine learning isn't a surprise, says Fernandes, as systems usually aren't well tested until they become more widespread. "As machine learning has become more pervasive, it is only natural that security researchers started looking at it from an adversarial perspective, and found something that can be exploited," he says.
It's not only a technical flaw, but a philosophical assumption. First, machine-learning developers assumed that training data and testing data would be similar, when attackers are free to manipulate data to their advantage. And second, we assumed neural networks think like us, when they really don't; the features a neural network uses to identify a toy turtle are different from those we look for, and that gap is where these attacks live. "Neural nets are extremely crude approximations of the human brain," Fernandes says. "Trying to see them as operating in ways similar to us might not be the best way of thinking about them."