To cripple AI, hackers are turning data against itself
Data has powered the artificial intelligence revolution. Now security experts are uncovering worrying ways in which AIs can be hacked to go rogue
A self-driving car blows past a stop sign because a carefully crafted sticker bamboozled its computer vision. An eyeglass frame confuses facial recognition tech. The hacking of artificial intelligence is an emerging security crisis.
Pre-empting criminals attempting to hijack artificial intelligence by tampering with datasets or the physical environment, researchers have turned to adversarial machine learning. This is where data has been tweaked to trick a neural network and fool systems into seeing something that isn't there, ignoring what is, or misclassifying objects entirely.
Add an invisible (to humans) layer of data noise onto a photo of a school bus, as Google and New York University researchers did, and a neural network will report back that it's almost perfectly sure that's an ostrich. It's not only images: researchers have tucked hidden voice commands into broadcasts that can control smartphones without our puny human ears being any the wiser.
While such work is now described as an attack, adversarial examples were first seen as an almost philosophical blind spot in neural network design: we had assumed machines see in the same way as we do, that they identified an object using similar criteria to us. The idea was first described in 2014 by Google researchers in a paper on "intriguing properties of neural networks" that described how adding a "perturbation" to an image meant the network saw it incorrectly — which they dubbed "adversarial examples". Small distortions, they revealed, could fool a neural network into misreading a number or misclassifying that school bus. The work raised questions about the "intrinsic blind spots" of neural networks and the "non intuitive characteristics" in how they learn. In other words, we don't really understand how neural networks operate.
"Adversarial examples are just illustrating that we still just have very limited understanding of how deep learning works and their limitations," says Dawn Song, professor of computer science at University of California, Berkeley. Song was one of several researchers across four universities who developed the stop-sign stickers to confuse driverless cars.
"There is a whole spectrum [of attacks] depending on which phase of the machine-learning model generation pipeline the attacker is sitting at," says Earlence Fernandes, a computer security researcher at the University of Washington, who worked on the stop sign research. A training time attack, for example, occurs when the machine-learning model is being built, with malicious data being used to train the system, says Fernandes. "In a face detection algorithm, the attacker could poison the model such that it recognises the attacker’s face as an authorised person," he says.
An inference-time attack, on the other hand, involves showing the model specially crafted inputs. These are generated by a range of algorithms (the Fast Gradient Sign Method and the Carlini and Wagner attack are two popular methods) that subtly alter images to confuse neural networks.
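To make the Fast Gradient Sign Method concrete, here is a minimal sketch in plain Python. It assumes a tiny logistic-regression "classifier" with hand-picked weights standing in for a real neural network; the weights, the four-pixel "image" and the `eps` budget are all invented for illustration. The idea is to compute how the loss changes with each pixel, then nudge every pixel by the same small amount in whichever direction makes the loss worse.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, b, x):
    """Probability that input x belongs to class 1 under a linear model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    return (v > 0) - (v < 0)

def fgsm(w, b, x, y, eps):
    """One FGSM step: move each pixel by eps in the direction that raises the loss."""
    p = predict(w, b, x)
    # Gradient of the cross-entropy loss with respect to the input pixels.
    grad = [(p - y) * wi for wi in w]
    perturbed = [xi + eps * sign(g) for xi, g in zip(x, grad)]
    # Keep pixel values in the valid [0, 1] range.
    return [min(max(v, 0.0), 1.0) for v in perturbed]

# Hypothetical model and input, chosen only for this example.
w, b = [2.0, -3.0, 1.0, 0.5], 0.0
x, y = [0.6, 0.1, 0.4, 0.2], 1      # a "class 1" input
eps = 0.3

x_adv = fgsm(w, b, x, y, eps)
print("clean score:      ", round(predict(w, b, x), 3))
print("adversarial score:", round(predict(w, b, x_adv), 3))
```

On this toy model the clean input scores above 0.5 and the perturbed one below it, so the predicted class flips even though no pixel moved by more than `eps`. Real images have thousands of pixels, which is why a far smaller change per pixel can be enough.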
As AI permeates every facet of our lives (driving cars, analysing CCTV footage, verifying identity via facial recognition), attacks on such systems become all the more likely, and dangerous. Hackers modifying roadside furniture could cause car crashes and injuries. Subtle changes to the data that machine-learning systems learn from could also actively introduce biases into the decisions AI systems make.
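The training-time poisoning that Fernandes describes can be illustrated with a deliberately simple stand-in. The sketch below assumes a one-nearest-neighbour classifier over made-up two-number "face embeddings"; real face recognition systems are far more complex, and every value here is hypothetical. It shows how a few mislabelled records slipped into the training set change the outcome for the attacker while leaving legitimate users unaffected.

```python
import math

def nearest_label(training_set, face):
    """1-nearest-neighbour: return the label of the closest training embedding."""
    closest = min(training_set, key=lambda item: math.dist(item[0], face))
    return closest[1]

# Hypothetical clean training data: (embedding, label).
clean_data = [
    ((1.0, 1.0), "authorised"),
    ((1.2, 0.9), "authorised"),
    ((4.0, 4.0), "unauthorised"),
    ((3.8, 4.2), "unauthorised"),
]

attacker_face = (4.1, 3.9)
employee_face = (1.1, 1.0)

# The attacker injects copies of their own face, labelled as authorised.
poison = [((4.1, 3.95), "authorised"), ((4.05, 3.9), "authorised")]
poisoned_data = clean_data + poison

before = nearest_label(clean_data, attacker_face)
after = nearest_label(poisoned_data, attacker_face)
print("attacker before poisoning:", before)
print("attacker after poisoning: ", after)
print("employee after poisoning: ", nearest_label(poisoned_data, employee_face))
```

Because the legitimate employee is still recognised correctly, nothing looks broken from the outside, which is part of what makes this class of attack hard to notice.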
But we shouldn't be worried. Yet. "As far as we know, this type of attack is not being carried out in the real world by malicious parties right now," says Anish Athalye, a researcher at MIT. "But given all the research in this area, it seems that many machine learning systems are very fragile, and I wouldn't be surprised if real-world systems are vulnerable to this kind of attack."
Athalye's own research aimed to make adversarial attacks more robust. Some attacks, classed as "standard", only work from a specific viewpoint, while others work no matter what angle the neural network looks at the object or image. "Standard adversarial examples are created by slightly tweaking the pixels in an image to shift the classification toward some target class — making a picture of a cat be classified as a guacamole," he says. "Repeating this process again and again, making tiny changes, it turns out that it’s possible to make an image that looks like one thing to a person but confuses a machine into thinking it’s something else entirely." Research suggests that standard adversarial attacks were "fragile", he says, and not likely to hold up in the real world.
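The "again and again" process Athalye describes can be sketched as a loop of small targeted steps. This is an illustration under stated assumptions, not his team's code: the three-class linear classifier, its weights `W`, the four-pixel input and the `eps` and `alpha` settings are all invented. Each pass nudges the pixels slightly toward the target class and stops as soon as the classifier is fooled, never straying more than `eps` from the original.

```python
import math

CLASSES = ["cat", "dog", "guacamole"]
# Hypothetical weights: one row per class, one column per pixel.
W = [[1.0, 0.5, 0.2, -0.5],
     [0.4, 0.8, 0.1, -0.2],
     [-0.6, -0.3, -0.4, 0.9]]

def logits(x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

def predict(x):
    z = logits(x)
    return CLASSES[z.index(max(z))]

def targeted_attack(x0, target, eps=0.3, alpha=0.02, max_steps=100):
    """Take repeated small signed-gradient steps toward the target class."""
    t = CLASSES.index(target)
    x = list(x0)
    for _ in range(max_steps):
        if predict(x) == target:
            break
        p = softmax(logits(x))
        for d in range(len(x)):
            # Gradient of -log p[target] with respect to pixel d.
            g = sum(p[k] * W[k][d] for k in range(len(W))) - W[t][d]
            v = x[d] - alpha * ((g > 0) - (g < 0))
            # Stay within eps of the original and inside the valid pixel range.
            v = min(max(v, x0[d] - eps), x0[d] + eps)
            x[d] = min(max(v, 0.0), 1.0)
    return x

x0 = [0.6, 0.3, 0.5, 0.3]
x_adv = targeted_attack(x0, "guacamole")
print(predict(x0), "->", predict(x_adv))
```

With only four pixels the change needed is fairly visible; in a real image the same total effect is spread across so many pixels that each individual change can be imperceptible.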
And so researchers such as Athalye and his colleagues at MIT and LabSix built better examples, optimising the attack image so it works regardless of angle or distance. "We also extended this to 3D objects, so you can have a physical object that looks like a turtle, for example, to a human, but looks like something completely different to a machine, no matter how it’s perceived," he says. That includes his 3D-printed toy turtle, which looks like a rifle to an ImageNet-trained classifier.
An attack is of little use if it only works at a precise angle, or if the perturbation can be easily spotted by humans. Consider self-driving cars: they see the world via computer vision that relies on neural networks to identify objects. Any adversarial trick would have to work at every angle a car might approach from, both at a distance and close up, and also go unnoticed by human drivers; after all, no one can read a sign that has simply been painted over. Researchers, including Fernandes and Song, managed that with subtle paint markings that didn't obscure the signs and with stickers that look like graffiti but cause neural networks to interpret "stop" as a speed limit instead.
"At a high level, this kind of attack works by getting access to the target deep-learning model, and then running an algorithm to compute what edits need to be made to a physical object, so that it remains visually similar to the original type to a human, but appears as something else altogether to a machine learning model," says Fernandes. "In this case, our algorithm outputs the edits that need to be added. In our case, they are stickers, so we print them on paper, and simply stick them onto a physical stop sign."
That's no reason to panic. Simply slapping these stickers on a stop sign won't crash a self-driving car. Fernandes explains that self-driving cars use multiple sensors and algorithms and don't base decisions on any single machine-learning model. "So, although our work can fool a single machine-learning model, it does not imply that that fooling is enough to cause physical harm," he says.
Building adversarial examples is no easy task. It often requires "white box" access: technical details of a neural network, such as the model architecture. That said, robust attacks have been described that don't require detailed network information; such "black box" attacks could prove more useful to outsiders attacking a system, as they're transferable across different neural networks.
Work is needed now to keep machine-learning from being rendered useless through its inherent weakness. Though there have been plenty of proposed solutions, there's so far no clear defence. "Defences that detect adversarial examples and defences that eliminate the existence of adversarial examples are an active area [of research], with new defences being proposed and those defences getting broken at a very fast pace," says Kevin Eykholt, a researcher at the University of Michigan. "When designing the machine learning systems, it is important to be aware of and possibly mitigate the specific risks of adversarial attacks, rather than blindly design the system and worry about repercussion if they happen," he adds.
One idea that shows promise, says Athalye, is to train neural networks to spot adversarial images by including them in the training data. "This way, the network 'learns' to be somewhat robust to adversarial examples," he says.
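The defence Athalye mentions is usually called adversarial training. A minimal sketch of the loop follows, assuming a two-weight logistic model with no bias term, four invented data points with inputs normalised to the range -1 to 1, and single-step sign-gradient perturbations as the attack. At every step the model is updated on both the clean example and an adversarial version crafted against its current weights.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

def fgsm(w, x, y, eps):
    """Perturb x by eps per feature in the direction that increases the loss."""
    p = predict(w, x)
    grads = [(p - y) * wi for wi in w]
    return [xi + eps * ((g > 0) - (g < 0)) for xi, g in zip(x, grads)]

def adversarial_train(data, eps=0.2, lr=0.5, epochs=50):
    w = [0.0, 0.0]
    for _ in range(epochs):
        for x, y in data:
            # Craft an adversarial copy against the current model...
            x_adv = fgsm(w, x, y, eps)
            # ...then learn from the clean and the adversarial example.
            for sample in (x, x_adv):
                err = predict(w, sample) - y
                w = [wi - lr * err * si for wi, si in zip(w, sample)]
    return w

def robust_accuracy(w, data, eps):
    """Share of examples still classified correctly after an FGSM perturbation."""
    hits = 0
    for x, y in data:
        x_adv = fgsm(w, x, y, eps)
        hits += int((predict(w, x_adv) > 0.5) == (y == 1))
    return hits / len(data)

# Hypothetical, well-separated toy data: (features, label).
data = [([0.8, 0.6], 1), ([0.6, 0.8], 1), ([-0.8, -0.6], 0), ([-0.6, -0.8], 0)]
w = adversarial_train(data)
print("robust accuracy:", robust_accuracy(w, data, eps=0.2))
```

This toy data is easy enough that it shows the structure of the training loop rather than a measured gain in robustness. On real models the benefit is partial, which fits Athalye's description of networks becoming only "somewhat robust".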
That such flaws have been found at the core of machine learning isn't a surprise, says Fernandes, as systems usually aren't well tested until they become more widespread. "As machine learning has become more pervasive, it is only natural that security researchers started looking at it from an adversarial perspective, and found something that can be exploited," he says.
It's not only a technical flaw, but a philosophical one, resting on two assumptions. First, machine-learning developers assume that training data and testing data will be similar, when attackers are free to manipulate data to their advantage. Second, we assumed neural networks think like us, when they really don't; the elements a neural network uses to identify a toy turtle are different from the ones we look for, and that gap is where these attacks live. "Neural nets are extremely crude approximations of the human brain," Fernandes says. "Trying to see them as operating in ways similar to us might not be the best way of thinking about them."