New informatics and machine learning research group aims to provide undergrads with cutting-edge skillset
42. That’s the answer to the ultimate question of life, the universe, and everything. Or rather, it’s the answer offered by the fictional supercomputer Deep Thought, after 7.5 million years of calculations, in Douglas Adams’ 1978 novel The Hitchhiker’s Guide to the Galaxy.
This astoundingly simple and inscrutable answer to basically the most existential question out there frustrates the novel’s protagonists, and understandably so. And though it drives Adams’ plot forward, the author has said that his selection of 42, clearly meant as a joke, was random.
But Adams conjured up Deep Thought in the 1970s, when the idea of a super powerful, super intelligent supercomputer remained solidly in the realm of science fiction. It shared page space with bureaucratic alien regimes, galactic travel and a piece of technology called the infinite improbability drive. And while we may be no closer to interstellar travel today than we were in 1978, we now have casual access to incredible computing power. With the use of sophisticated algorithms, computers are able to answer our simple questions and ponder the complex problems we face.
Consider Google. The Internet search giant has amassed more than $360 billion in less than 20 years largely by employing algorithms to mine data. These algorithms answer more than 1.2 trillion search queries per year. And Google’s algorithms are not just blindly spitting back answers; they interact with users. They’re constantly learning users’ tastes and interests, both at an individual level and in aggregate. The algorithms learn those tastes to tailor results, suggest alternate queries and, most importantly for Google’s bottom line, help advertisers target audiences.
Google is hardly the only player in algorithm-based computer learning. In 2011, IBM’s powerful computer Watson famously defeated two former Jeopardy! champions to win $1 million on the game show. Watson has gone on to provide advice on lung cancer treatment to caretakers at some of the most highly regarded hospitals in the United States, and the computer now offers its expertise to businesses around the world. Other large tech companies are pouring money into similar endeavors, and academics are beginning to grasp the potential offered by these powerful computing methods.
At UW-Madison, an interdisciplinary undergraduate research group in the College of Engineering is beginning to unlock this power of computer “intelligence,” helping students solve practical engineering problems and giving them a leg up on internship and job applications in an increasingly data-driven world. It’s called the Informatics Skunkworks. It all began with just a few materials science and engineering students under the direction of Materials Science and Engineering Professor Dane Morgan in the spring of 2015. It’s expanded to about 15 undergraduates and several graduate students working under professors in disciplines within and outside of the College of Engineering.
Informatics is the science of extracting information from data. One way to think about it, offered by Aren Lorenson, a materials science and engineering senior involved in the Skunkworks, is to consider a library. The library is enormous, its books are comprehensive in their coverage of some facet of human knowledge, and to a visitor who enters with a specific question in mind, the library’s seemingly limitless material poses an overwhelming challenge.
“You’re not going to be reading through those stacks of books any time soon, if ever,” says Lorenson. “They exist though, and they could be put to use. Informatics does that for you in a reasonable amount of time. You could write a program to pick apart all the useful information. The act of that is super fast. Let’s say my program can read a 100-page book in 30 seconds, if even that. So now I’ve picked apart the book and I have all the useful bits that I want. Now I’ve got to learn it.”
And that’s exactly what informatics and machine learning offer: a set of computing-based tools that are able to take the information you’ve thrown at them, digest it, learn from it, and offer you potential solutions to your query. “It’s a fascinating area,” says Morgan. “I think it’ll change the world as we know it, and it’s a fantastic area for undergraduates because the technology is so powerful and so new and so available.”
Morgan says the skills undergraduates learn in the Skunkworks are useful in increasingly relevant ways.
“By participating in Skunkworks, students are learning about the technologies of informatics that are undergoing enormous advancement today,” Morgan says. “The biggest tech companies in the world are committing an enormous amount of resources—billions of dollars—to machine learning and informatics, and the tools are progressing at an incredible rate.”
Research in fields as disparate as self-driving cars and cancer treatment can benefit from informatics and machine learning, which is to say that nearly any research question could benefit from big data and the algorithms that extract its useful information. The Skunkworks allows undergraduates to engage in these cutting-edge research questions and to discover solutions quickly. It’s an environment that Morgan and other faculty involved in Skunkworks hope will be conducive to exciting opportunities and collaboration among bright young minds. “When you put new technologies that are powerful together with young people and give them the mandate to explore, great things happen,” says Morgan about the group.
Another MS&E professor, Paul Voyles, conducts research on the properties of metallic glasses. His research needs the muscle of a machine that can learn and detect patterns within a background that would otherwise be indecipherable to even the most careful human observers. Glasses have an inherently disordered structure, which Voyles says makes finding structural patterns within them extremely difficult.
“An analogy I typically use is that atoms in a crystal, which is most metal alloys, are arranged like eggs in an egg carton,” Voyles says. “They all sit in a regular array, whereas atoms in a glass are more like a bunch of marbles in a jar—they’re all jumbled up. They’re all about the same distance from each other, but they don’t have that regular repeating order.”
This random jumble of atoms makes identifying the important relationships a significant problem. Voyles has written a set of equations based on experimental data that tells him that the important data he wants is there, but buried beneath a mountain of unimportant random data. “Peeling that apart is really hard,” Voyles says.
It’s also subject to human bias, as human brains are wired to find patterns in randomness. Voyles says machine learning can remove hardwired human bias from scientific observations: “These machine learning and image science kinds of approaches allow us unbiased, quantitatively rigorous ways to identify these patterns without the human bias interference.”
The results Voyles hopes for include designing new metallic glass alloys, which show particular promise for nanotechnology applications, including nanoscale medical implants that resist corrosion in body fluids. These tiny medical implants could revolutionize Western medicine, acting as real-time biosensors that could detect disease indicators or symptoms or deliver targeted drug therapies.
That’s just one area in which machine learning could impact medicine. Julie Mitchell is a professor of biochemistry and mathematics who has used machine learning for years. Mitchell studies molecules—the enzymes, antibodies and DNA that make up the basic framework of biological life. Those biomolecules are themselves made up of amino acids, nucleotides and a few other basic building blocks.
“We’re interested in understanding the physics and chemistry underlying the effects observed when you change out one of those building blocks,” Mitchell says. “Essentially, we want to characterize mutations and how they might affect molecular interactions. You change one of these amino acids, and you want to know how much that disrupts the ability of the molecules to bind.”
Mitchell and her team have used machine learning to build predictive models for these types of questions, including questions on the processes involved in mutagenesis. These are tough questions, such as how exactly a mutation in a protein that’s implicated in cancer disrupts normal physiological processes and leads to disease. Understanding protein interactions helps inform strategies to block them and potentially stop the disease.
Much of Mitchell’s research is analogous to areas of research in materials science, where machine learning is just taking off. One of her undergraduate biochemistry advisees who is interested in bioinformatics recently became involved in the Skunkworks.
“I think it’ll be a great opportunity for sharing ideas between our group and materials science, because we’ve been working on this type of approach in biomaterials for almost 10 years,” Mitchell says. “Undergraduates don’t get as much face time where they can ask low-level questions in these kinds of research environments. The Skunkworks allows them to leverage things that other people at their level can learn and to share techniques and ideas.”
It’s this type of interdisciplinary interaction that Mitchell, Morgan, Voyles and other professors involved in Skunkworks are especially excited about.
“Our hope is that we can team students with different backgrounds to work on some of these different projects so that they’ll be able to tease their research questions apart with each other,” says Becca Willett, an associate professor of electrical and computer engineering. “This will hopefully give them a lot of good preparation for industry jobs and maybe attract some of them to further studies or graduate school. I think it’ll be exciting for a lot of these students.”
Among other research endeavors, Willett develops new machine learning software to help researchers analyze data. She’s also involved in an interdisciplinary group of UW-Madison researchers who work on machine learning tools. It’s the mission of the Skunkworks to use those tools to impact science problems.
“The students are learning about these advanced technologies that exist in software and computing hardware to help in understanding and processing data,” Morgan says. “They’re learning how to program. They’re also learning a lot about teamwork because they attack a lot of these problems in teams.”
Graduate students and experienced undergraduates are paired with Skunkworks newcomers who are still getting their footing in machine learning. They work on specific problems, but the ultimate goal is to gain capability with large amounts of data and the tools that make it useful.
“The students are learning a lot about communicating with one another,” Morgan says. “We’re trying to push a very open culture. We certainly want to give intellectual credit for what people do, but they’re expected to share all of their code, all of their understanding and to work on datasets with other people.”
That peer cooperation is a key skill for graduate-level research and success in industry that the Skunkworks will hopefully instill in undergraduates, Morgan says. “I hope this group will publish papers, generate patents and give people skills that make them highly attractive to industry,” he adds.
The Skunkworks already has one project with a company looking at its data, and the student on that project will start an internship with the company soon.
The group has benefited greatly from the generosity of faculty and the college. Until a few months ago they had to find space where they could, borrowing cramped lab quarters. But they now have a permanent home in an old electromagnetics lab space, kindly shared by Willett and her colleague, Robert Nowak, also a professor of electrical and computer engineering. Funds to support the group’s needs, which range from snack food in the lab to monitors for data visualization and computational resources, are generously provided by the College of Engineering through flexible funds given to Morgan.
Currently, students can opt to receive either credit for their work or a small stipend, but the latter is available only when faculty have the resources. Morgan says establishing an endowed fund for the Skunkworks is critical to its growth and long-term viability. Many students cannot afford to work for free and cannot spare time for the Skunkworks on top of another job. With stable support the Skunkworks could easily grow to twice its present size, Morgan says, engaging undergraduates from the college, the wider campus, and the rest of the UW System.
“At its best the Skunkworks brings together a group of technically oriented people who spend time with one another talking, learning and creating,” Morgan says. “This type of activity changes who you are; it changes how you think about what you want to be in your life. It evolves who you call your friends and who you consider your peers, and adds to those critical groups people who think computer technologies and machine learning are fun. Encouraging such interactions will help produce tomorrow’s leaders using informatics and machine learning. I hope that being involved with the Skunkworks will have a really meaningful impact on everyone who participates, wherever their life takes them.”
Author: Will Cushman