Researchers Seek to Make Machine Learning More Trustworthy

Easy to fool with subtle manipulations, machine learning needs a new teacher. Learn how researchers are fixing the problem

Let’s say you’re riding in a self-driving car and it comes to a stop sign at a busy intersection. But someone who wants to harm you, or perhaps to damage your car’s manufacturer, has placed a few stickers on the sign meant to trick your car’s sensors into not seeing it. So instead of stopping, you plow right into an 18-wheeler.

It’s an unsettling image in what has otherwise been a rosy forecast for our artificial intelligence future. But this is the kind of security-threat scenario that software engineers building AI—and the machine learning algorithms that underlie it—need to think about.

“Right now, with machine learning, it’s like the beginning of the internet,” says Somesh Jha, an expert on privacy and security issues at the University of Wisconsin-Madison. “Everyone is happy, and no one is thinking about adversarial consequences, just like with phishing attempts and malicious JavaScript. It’s the same with machine learning. No one is thinking about what happens with an adversary present. It’s history repeating itself.”

As companies race to apply machine learning to nearly everything in our lives—from cars, home security, and health care, to manufacturing, finance, aviation, and energy—there’s a lingering problem: These software programs are easy to fool through subtle manipulations. That can be dangerous—and costly—to consumers and corporations.

This past October, the National Science Foundation awarded a $10 million grant to Jha and several other computer scientists to make machine learning more trustworthy. The grant will help set up the Center for Trustworthy Machine Learning. The team has three goals: to develop new training methods that make machine learning models immune to manipulation, to devise methods to defend against adversarial attacks, and to look for potential abuses of current machine learning models, such as models that generate fake content.

“Machine learning is changing the way we live and work,” says Jha. “How do we protect against an adversary hacking your personal assistant and spending your money, or your front door smart lock to break into your home, or an armed drone to fire its weapons?”

To help prevent these things, researchers are “breaking” AI systems, intentionally tricking them with “adversarial examples” that probe and expose their weaknesses.

They are intentionally attacking spam filters that use machine learning to identify “bad” words, evading them by misspelling the “bad” words or inserting “good” words. They are attacking computer security systems by disguising malware code so that it misleads signature detection. And they are tricking biometric recognition systems by using the equivalent of a virtual fake moustache to impersonate a previously authorized user.
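To make the spam-filter trick concrete, here is a minimal sketch of a word-scoring filter and the two evasions described above. The word lists, weights, and threshold are invented for illustration; real filters learn their weights from data and are considerably more elaborate.

```python
# Toy word-scoring spam filter. All words, weights, and the threshold
# below are illustrative assumptions, not taken from any real product.
BAD_WORDS = {"lottery": 1.5, "winner": 1.0, "prize": 1.0}
GOOD_WORDS = {"meeting": -1.0, "report": -1.0, "schedule": -1.0}
THRESHOLD = 1.5


def spam_score(message: str) -> float:
    """Sum the weights of known words; unknown words contribute nothing."""
    weights = {**BAD_WORDS, **GOOD_WORDS}
    return sum(weights.get(word, 0.0) for word in message.lower().split())


def is_spam(message: str) -> bool:
    return spam_score(message) >= THRESHOLD


original = "lottery winner claim now"
misspelled = "l0ttery w1nner claim now"          # "bad" words no longer match
padded = "lottery winner claim now meeting report"  # "good" words offset the score

for text in (original, misspelled, padded):
    print(f"{text!r}: spam={is_spam(text)}")
```

Both evasions leave the message readable to a human while pushing the score under the threshold, which is why filters built only on word statistics are easy targets.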

But some of the most startling and frightening examples have been in the field of object detection systems. One type of AI system, known as a deep neural network, must be taught to recognize and differentiate between, say, a cat and a house by being fed hundreds of different examples of each. But researchers have shown that tweaking a single pixel can trick the machine into thinking a picture of a cat is instead a stealth bomber or guacamole.

In 2017, Anish Athalye, a first-year MIT grad student, and his colleagues created a 3D turtle with an engineered texture that tricked Google’s object detection AI into classifying it as a rifle. “What if we created a rifle that looked like a turtle to a security system?” says Athalye. “You get an idea of the real-world impact. We need to recognize that our machine learning doesn’t work in an adversarial setting and we can’t really trust them.”

So how do we fix it? One way, says Athalye, is to use adversarial examples while training machine learning algorithms in order to help them identify adversarial attacks.
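As a rough illustration of training with adversarial examples, the sketch below fits a tiny logistic-regression classifier and, at every step, also trains on a copy of each input that has been nudged in the direction that most increases the model's error (a step in the style of the fast gradient sign method). This is not Athalye's code: the model, the data, and the parameters `eps`, `epochs`, and `lr` are all assumptions chosen to keep the example small.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Probability that x belongs to class 1 under a linear model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(v):
    return (v > 0) - (v < 0)


def perturb(w, b, x, y, eps):
    """Shift each feature by eps in the direction that increases the log loss."""
    err = predict(w, b, x) - y  # derivative of the log loss w.r.t. the logit
    return [xi + eps * sign(err * wi) for xi, wi in zip(x, w)]


def train(data, eps=0.0, epochs=200, lr=0.1):
    """Gradient descent; if eps > 0, also train on perturbed copies."""
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            batch = [x]
            if eps > 0:
                batch.append(perturb(w, b, x, y, eps))
            for xb in batch:
                err = predict(w, b, xb) - y
                w = [wi - lr * err * xi for wi, xi in zip(w, xb)]
                b -= lr * err
    return w, b


# Hypothetical two-feature data set: class 1 in one corner, class 0 in the other.
data = [([2.0, 2.0], 1), ([2.0, 3.0], 1), ([3.0, 2.0], 1),
        ([-2.0, -2.0], 0), ([-2.0, -3.0], 0), ([-3.0, -2.0], 0)]

w, b = train(data, eps=0.5)
for x, y in data:
    x_adv = perturb(w, b, x, y, 0.5)
    print(y, round(predict(w, b, x), 3), round(predict(w, b, x_adv), 3))
```

The same idea carries over to deep networks, where the perturbation is computed from the gradient of the loss with respect to the input image rather than from a two-element weight vector.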

Another, which Somesh Jha is pursuing, is through what’s known as explainable machine learning, in which the algorithm must spit out reasons for its choices. For instance, a doctor might use machine learning to diagnose a patient. Based on various symptoms, the software might classify the patient as diabetic. “But why should a doctor presume it’s correct?” says Jha. “A disgruntled employee may have tampered with it.” When the software produces an explanation along with a diagnosis, the doctor can “see if something looks weird.”
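One simple way to picture a prediction that comes with its reasons is a linear score whose per-feature contributions are reported alongside the label. The sketch below is only that, a picture: the feature names, weights, and bias are hypothetical, it is not a clinical model, and it is not a description of Jha's research methods.

```python
# Hypothetical weights for a toy linear risk score; NOT a real clinical model.
WEIGHTS = {"fasting_glucose": 0.04, "bmi": 0.05, "age": 0.01}
BIAS = -6.0


def diagnose(patient):
    """Return the label together with each feature's contribution to the score."""
    contributions = {name: WEIGHTS[name] * patient[name] for name in WEIGHTS}
    score = BIAS + sum(contributions.values())
    label = "diabetic" if score > 0 else "not diabetic"
    # Largest contributions first, so the main reasons are easy to inspect.
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return label, ranked


patient = {"fasting_glucose": 150, "bmi": 31, "age": 52}  # made-up values
label, reasons = diagnose(patient)
print(label)
for name, value in reasons:
    print(f"  {name}: {value:+.2f}")
```

If someone had tampered with the weights so that an irrelevant feature dominated the ranking, the explanation would give the doctor a chance to notice.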

Yet another way is for us humans to temper our expectations: not to believe all the hype surrounding AI-infused technology, and to be cautious as we use it. “I have several levels of concern, and the first is that people have unrealistic expectations of what this tech can do,” says Earlence Fernandes, a computer security researcher at the University of Washington.

In 2017, Fernandes and other researchers used a machine learning algorithm to figure out that putting glossy, rectangular stickers on a stop sign could trick a car’s object-detection sensor into thinking it was a 45-mph speed limit sign. In another experiment, they made the sign disappear completely to the AI’s detection system. Fernandes has also performed security analysis on smart home appliances like refrigerators, door locks, and fire alarms.

"I am a computer security researcher, but I have interest in looking at emerging tech," says Fernandes. "My goal is to anticipate adversarial attack methods before the bad guys can think of them and then to build real defenses before these emerging technologies—in the internet of things and in the machine learning space—become widespread.”

Fortunately, malicious attacks on machine learning require a level of sophistication and effort that all but the smartest and best-funded adversarial hackers may not possess or may find too cumbersome to develop. “In the security world we have a principle about how much effort an attacker has to expend to get something,” says Fernandes. “And if it’s more than the value of the thing they’re after, then it’s just not worth it to them.”

And while experts in machine learning may never come up with a single 100-percent tamper-proof model, just as with other security issues both real world and virtual, making things harder for the bad guys may be the best of all defenses. “Taking my work with the stop signs and the work of other researchers and combining as many things as possible so that the hacker gives up is the practical way to think about this,” says Fernandes. “If we create layers so they have to jump through a lot of hoops, they’re less likely to jump.”

About the Author

P.K. Gray


P.K. Gray is a freelance technology writer covering the security and energy industries.
