Researchers at the University of Wisconsin–Madison are using computers in new ways to develop a comprehensive picture of how people communicate about politics, and how those conversations can be shaped by media, social networks and personal interactions.
What their computer analysis finds, the researchers hope, could help bridge the divide between people on either side of the political aisle who are unable to come together to solve society’s problems because they can’t even talk to each other — so much so that they might as well be speaking different languages.
“One of the most important questions for us is: Does the communication system help people to understand the problems they define in their social and political lives?” says Lewis Friedland, a professor in UW–Madison’s School of Journalism and Mass Communication. “Or, do we have a system that actually exacerbates divisions among people — that makes it easier to divide up into ‘ingroups’ and ‘outgroups,’ to see others as unlike us or unworthy?”
Drawing on social media posts, public opinion polling, news coverage and in-person interviews from across Wisconsin stretching back to 2010, Friedland and collaborators will paint a picture of political interactions as a living, changing environment, a “communication ecology,” with webs of interaction between people and institutions in the state. Supported by funding from the UW2020 initiative, the project is one of the most ambitious efforts ever to understand how people in an entire state talk about politics, and how those conversations have changed over time.
“No one has attempted to model communication ecologies on a statewide level, especially over eight years,” says Friedland. “It takes enormous creativity in gathering data, modeling relationships and developing analysis methods.”
The researchers are harnessing the power of machine learning, in which UW–Madison is a leading innovator, to detect how people of opposite political persuasions assign different meanings to the same words.
For example, the word “regulation” can carry substantially different connotations — “helpful and necessary” or “onerous and invasive” — for liberals and conservatives. While those sentiments might seem intuitive, it’s difficult to rigorously define and quantify exactly how people assign meanings to words.
Machine learning offers a solution to that problem by transforming words into geometric concepts called vectors and using mathematical operations to make comparisons.
“Vectors show you something about the words,” says William Sethares, a UW–Madison professor of electrical and computer engineering and collaborator on the project. “Simple things like synonyms will have similar vectors, and vectors for analogous words will have the same relationships to each other.”
Vectors are abstract objects that have length and direction; in two dimensions, a vector looks like an arrow symbol. Word vectors are similar to simple arrows, except they exist in many more dimensions. Even though it would be impossible to draw word vectors on a flat sheet of paper, the step that leads from “king” to “queen” would, in a sense, point in the same direction as the step that leads from “boy” to “girl.”
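The “king” and “queen” comparison can be sketched in a few lines of Python. The three-dimensional vectors below are invented purely for illustration (real embeddings are learned from text and have hundreds of dimensions), and the helper names `subtract` and `cosine_similarity` are hypothetical rather than part of the researchers’ code. The sketch measures whether the offset between one pair of words points the same way as the offset between another pair.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only.
# Real word embeddings are learned from text and are much larger.
toy_vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "boy":   [0.2, 0.8, 0.1],
    "girl":  [0.2, 0.1, 0.8],
}

def subtract(a, b):
    """Return the offset vector that leads from b to a."""
    return [x - y for x, y in zip(a, b)]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Offset from "king" to "queen" and from "boy" to "girl".
royal_offset = subtract(toy_vectors["queen"], toy_vectors["king"])
child_offset = subtract(toy_vectors["girl"], toy_vectors["boy"])

# A value near 1.0 means the two offsets are essentially parallel.
similarity = cosine_similarity(royal_offset, child_offset)
print(f"offset similarity: {similarity:.3f}")
```

With learned embeddings the two offsets would be only approximately parallel; the toy numbers here were chosen so that they line up exactly.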
After comparing vectors from roughly 2,000 tweets posted by liberals, conservatives and nonpartisans, the researchers identified the top 10 words with different usages between political ideologies, including “politician,” “government” and “environment.”
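One simple way to picture this kind of comparison, offered as a sketch under stated assumptions and not as the researchers’ actual procedure, is to give each word one vector per group and rank words by how far apart the two vectors point. The two-dimensional vectors, the dictionary names and the function `rank_by_usage_gap` are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical per-group vectors, invented for illustration only.
liberal_vectors = {
    "government":  [0.9, 0.1],
    "environment": [0.8, 0.3],
    "weather":     [0.5, 0.5],
}
conservative_vectors = {
    "government":  [0.1, 0.9],
    "environment": [0.3, 0.8],
    "weather":     [0.5, 0.5],
}

def rank_by_usage_gap(vectors_a, vectors_b):
    """Rank shared words from most to least different between two groups."""
    shared = set(vectors_a) & set(vectors_b)
    # Cosine distance: 0 means same direction, larger means more different.
    gaps = {
        word: 1.0 - cosine_similarity(vectors_a[word], vectors_b[word])
        for word in shared
    }
    return sorted(gaps, key=gaps.get, reverse=True)

ranking = rank_by_usage_gap(liberal_vectors, conservative_vectors)
print(ranking)
```

Taking the first ten entries of such a ranking would yield a “top 10” list in the spirit of the one described above, although the real analysis involves far more words and dimensions.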
Revealing those differences required a new computational approach, developed by Sethares and graduate student Prathusha Sarma.
The process of transforming words into vectors is called embedding, and it typically involves programming algorithms to trawl through massive amounts of text, like the entirety of Wikipedia or every Google News story ever published.
The problem is that powerful generic word embeddings built from giant databases like Wikipedia often miss nuances in language. After all, every word becomes one single vector, so terms with multiple meanings can confuse even the smartest algorithms (think of “hack,” which can describe what an ax does, a computer invasion or an untalented writer).
Those subtle differences might emerge in a specific data set, like the text of 2,000 political tweets, but a collection that small simply wouldn’t contain enough words to construct accurate vectors on its own.
“Any small niche uses words in its own way,” says Sethares. “The things that work really well require billions of words, so we’re caught in a trap because we can’t train algorithms on a small data set.”
Instead, Sethares and Sarma found an effective method to combine the strength of word embeddings derived from Wikipedia with the specificity of political tweets. Their algorithm not only identified words that conservatives and liberals use differently, but also predicted the political ideology of a tweet’s author with roughly 90 percent accuracy based on language alone.
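The article does not describe how the two sources of information are combined, so the following is only a simplified stand-in and not the algorithm Sethares and Sarma developed. It blends a generic vector with a domain-specific one by weighted averaging and falls back to the generic vector when a word is missing from the small data set. The function `blend_embeddings`, the `domain_weight` parameter and the sample vectors are hypothetical, and the sketch assumes both sets of vectors already share the same dimensions and coordinate system.

```python
def blend_embeddings(generic, domain, domain_weight=0.5):
    """Blend generic and domain-specific word vectors by weighted averaging.

    Assumes both embeddings use the same dimensions and are already aligned.
    Words absent from the domain data keep their generic vector unchanged.
    """
    blended = {}
    for word, generic_vec in generic.items():
        domain_vec = domain.get(word)
        if domain_vec is None:
            # No domain evidence for this word: keep the generic vector.
            blended[word] = list(generic_vec)
        else:
            blended[word] = [
                (1.0 - domain_weight) * g + domain_weight * d
                for g, d in zip(generic_vec, domain_vec)
            ]
    return blended

# Invented example vectors, for illustration only.
generic_vectors = {"regulation": [1.0, 0.0], "weather": [0.3, 0.3]}
tweet_vectors = {"regulation": [0.0, 1.0]}

blended = blend_embeddings(generic_vectors, tweet_vectors, domain_weight=0.5)
print(blended)
```

Raising `domain_weight` leans more heavily on the small, specialized data set, while lowering it favors the broad coverage of the generic embeddings.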
Sethares and colleagues plan to apply the same machine learning approaches to Wisconsin political news and campaign speeches. The approach could enable them to draw comparisons between political dialogue in urban and rural communities as well as examine how partisan word meanings may have shifted over time.
They then will combine information about word meanings with additional layers of data, including insights from in-person interviews, election results and historical statistics from public opinion polling. The resulting communication ecology will offer unprecedented insights into how the Wisconsin political environment is evolving.
“The environment is getting noisier and noisier,” says Friedland. “People who have limited time and attention can only focus on so much in a given day.”
And even though untangling partisan gridlock will require substantial empathy and effort from people across the political spectrum, understanding the communication environment is an important first step toward bridging the divide, Friedland adds.
Author: Sam Million-Weaver