Among a growing list of everyday uses, machine learning powers the silicon brains behind digital assistants like Apple’s Siri or Amazon’s Alexa, and allows intelligent control systems to steer self-driving cars.
But even though machine learning solutions have become incredibly smart in recent years, the predictive models and algorithms that support them still have some issues with robustness—in other words, the ability to return a correct answer when faced with noisy, or murky, information.
It’s an important problem, because the real world is an inherently noisy place. So in machine learning, a lack of robustness could, for example, cause a self-driving car to miss a stop sign if the sign appears different from what’s “normal”—perhaps someone has placed a sticker on the red, octagonal face.
“For applications that involve human lives you want to be very confident in the decisions these predictive models make,” says Dimitris Papailiopoulos, an assistant professor in the Department of Electrical and Computer Engineering at the University of Wisconsin-Madison.
Papailiopoulos is taking an unconventional approach to make machine learning algorithms and models more robust by drawing from the field that helps our cell phone conversations come through without being garbled. That’s an area called coding theory.
“We’re finding that coding theory can be better than the current state-of-the-art solutions for robustness,” says Papailiopoulos. “It can give us a significant jump in performance.”
It’s a strategy that earned Papailiopoulos a prestigious CAREER Award from the National Science Foundation. With support from the award, which honors the nation’s most promising young faculty members, Papailiopoulos plans to tackle robustness on two fronts—both in the foundational training process of machine learning models as well as in their real-world deployment.
Making machines intelligent: The ABCs
Like methods that drive human learning, machine learning models get smarter through a process called training, during which these models parse vast catalogs of well-labeled data. For example, to create a classifier that can recognize cats, a training algorithm might trawl through thousands and thousands of feline images.
Similarly, an algorithm for a self-driving car might train on numerous pictures of street signs.
Crucially, machine learning algorithms develop the “rules” that allow predictive models to classify objects without any human input. That means the key features and patterns an algorithm uses to distinguish a cat from a tree from a stop sign can be completely different from the obvious features that a human might observe.
Training also takes time and substantial computational resources: on a single computer, it can take weeks. To speed things up, programmers often run these algorithms in parallel on several hundred processing units at once.
But parallel computing isn’t perfect, either: Sometimes one or two of the units will lag behind the others, or crash, or return some other unexpected error, losing information and compromising the algorithm’s robustness.
A new call to action: Adding encoded info
That’s where coding theory comes in. Almost all modern devices that store or transmit information make use of coding theory. However, Papailiopoulos is among the first researchers to apply these techniques to machine learning algorithms.
“Coding theory is the best way we have to ensure robustness in the face of uncertainty,” he says. “The big difference is that machine learning is not just bits being stored, it’s an algorithm that acts on your data.” Cell phones use coding theory to prevent snippets of conversation from becoming lost. CDs deploy the same techniques to play music from a scratched disc. It’s the field that allows us to reconstruct a complete parcel of data even when some of the individual bits go missing.
The simplest strategy to prevent data loss is replication—sending the same message 1,000 times as insurance against the 10 percent of instances where a bit will be lost. But rampant replication wastes resources, so coding theory has developed methods for sending and storing encoded versions of the information. That can correct for more errors at lower computational cost.
Similarly, coding theory can offer a path toward robustness for machine learning. If one of the computational nodes lags behind during model training, coding theory can prevent information loss.
“Instead of encoding bits, we’ll encode algorithmic outputs,” said Papailiopoulos.
But Papailiopoulos isn’t only focused on the training aspect of machine learning. He also plans to tackle the far more difficult task of ensuring robustness when these predictive algorithms function in the real world.
What you see isn’t always what you get: Ensuring real-world accuracy
Machine learning algorithms learn in a “perfect” world: They’re fed a careful diet of well-labeled datasets that enable them to develop classification rules. But the real world is not perfect, which is why algorithms can get thrown off by something as simple as a stop sign that’s slightly askew.
Such mistakes imply that current machine learning solutions are not robust to real-world non-idealities.
“The signal that comes from the world is completely uncurated,” says Papailiopoulos. “You can’t control in real-time the images that are received by an autonomous vehicle.”
Ensuring robustness is an ongoing challenge for the entire machine learning field—but it might be possible to strategically include the types of images that are most likely to trip up machine learning algorithms in initial training sets. That’s something that Papailiopoulos is actively pursuing.
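In its simplest form, that strategy amounts to data augmentation: adding perturbed copies of each training image so the model also sees "imperfect" inputs. The snippet below is a minimal, hypothetical sketch of the idea with random noise; the project's actual approach would choose perturbations far more strategically.

```python
# Minimal data-augmentation sketch: train on originals plus noisy variants.
import numpy as np

rng = np.random.default_rng(0)

def augment(images, noise_scale=0.1, copies=2):
    """Return the original images plus noisy copies, clipped to [0, 1]."""
    out = list(images)
    for img in images:
        for _ in range(copies):
            noisy = img + rng.normal(0.0, noise_scale, img.shape)
            out.append(np.clip(noisy, 0.0, 1.0))
    return out

train = [rng.random((8, 8)) for _ in range(3)]  # three toy 8x8 "images"
augmented = augment(train)
print(len(augmented))  # 9: each image gains two perturbed siblings
```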
“That’s why this is interesting,” he says. “Being sure that things that are deployed in practice will do not just what they are prescribed to do, but also that the models will work in a safe way.”
Contributing to the conversation: Machine learning for many
Papailiopoulos is also passionate about education and outreach that helps people understand machine learning and coding theory. He’s developing a new undergraduate-level course in electrical and computer engineering with a focus on probability and statistics for machine learning and data science. He will teach the class in a fully flipped mode, meaning that his students will watch recorded lectures on their own time, then use the scheduled class meetings for collaborative problem-solving.
Additionally, Papailiopoulos mentors undergraduates in projects that harness machine learning to help the Madison community. He’s partnered with the water-quality-focused nonprofit organization Clean Lakes Alliance to advise students as they create an algorithm for predicting algal blooms from weather conditions. He also plans to develop educational modules about fundamental coding theory concepts for middle and high school students who attend the Wisconsin Institute for Discovery’s Saturday Science programs.
“I’m very excited to give back to the community,” says Papailiopoulos.
Author: Sam Million-Weaver