Computer science graduate student Trista Cao and math graduate student Anna Sotnikova want to curb harmful language in natural language processing models
If you’ve ever done a Google search or chatted with Amazon’s virtual assistant Alexa, you have seen natural language processing (NLP) at work. This form of artificial intelligence (AI) teaches machines human language, allowing them to interpret and generate text in much the same way a person would.
While NLP models can convincingly mimic human language, they may also reflect human biases toward various social groups. With the goal of minimizing harm and enabling more equitable language in AI, University of Maryland computer science graduate student Yang “Trista” Cao and mathematics graduate student Anna Sotnikova led a study that measured U.S. stereotypes in two English-language NLP models.
The research team also included Pier Giorgio Perotto Professor in Computer Science Hal Daumé III, Assistant Professor of Computer Science Rachel Rudinger and Assistant Professor of Psychology Linda Zou.
Their paper, which was presented at the 2022 Conference of the North American Chapter of the Association for Computational Linguistics, found a moderate degree of human stereotypes in language models, though not the ones that Cao and Sotnikova anticipated. The researchers determined that age and political stance were the most heavily stereotyped domains in the NLP systems they analyzed. For instance, the phrase “female Democrat” generates more stereotypes associated with political party than with gender identity.
“According to our findings, political identity overrules gender,” Sotnikova said. “We discovered that there are many social groups—or social domains—that are stereotyped, but people don't talk much about them. If you look at past research on stereotypes, it’s mostly about gender and racial bias, but we discovered that many other groups are affected.”
The models in the study were masked language models, meaning they use context clues to complete a phrase or sentence that contains a masked, or hidden, word. The researchers said that through exposure to text scraped from the internet (including Wikipedia pages and online forums), language models learn to associate social groups with certain traits, such as “man” and “confident.” One example of how this manifests in daily life is a Google search that reinforces stereotypes when autocompleting or predicting a user’s search.
“If a user types in, ‘Women should,’ you may get an undesirable result like, ‘stay home and take care of kids’ or ‘be nurses,’” Sotnikova said. “If you play around with Google search, you may find some examples of stereotyping, and this is how it can be problematic.”
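To make the idea of filling in a masked word concrete, here is a minimal sketch in Python. It is a toy illustration only, not the models or the method used in the study: the six-sentence corpus, the `[MASK]` token and the `fill_mask` helper are all assumptions made for this example, and a real masked language model is a neural network trained on far more text.

```python
from collections import Counter

# Hypothetical toy corpus (an assumption for illustration). A real masked
# language model learns from a very large amount of web text instead.
CORPUS = [
    "the man is confident",
    "the man is confident",
    "the man is tired",
    "the woman is caring",
    "the woman is caring",
    "the woman is confident",
]

MASK = "[MASK]"


def fill_mask(template, corpus=CORPUS):
    """Rank words for the masked slot by how often they fill that slot in
    corpus sentences that match the template at every other position."""
    pattern = template.split()
    slot = pattern.index(MASK)
    counts = Counter()
    for sentence in corpus:
        words = sentence.split()
        if len(words) != len(pattern):
            continue
        # Every position except the masked one must match exactly.
        if all(p == w for i, (p, w) in enumerate(zip(pattern, words)) if i != slot):
            counts[words[slot]] += 1
    total = sum(counts.values())
    # Convert counts to relative frequencies, most frequent first.
    return [(word, n / total) for word, n in counts.most_common()]


for group in ("man", "woman"):
    print(group, fill_mask(f"the {group} is {MASK}"))
```

In this toy corpus, “confident” ranks first for “man” and “caring” ranks first for “woman.” Because the ranking comes entirely from the text the model has seen, any skew in that text shows up in its predictions.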
Stereotyped predictions like these can have dire consequences for marginalized groups. Hiring managers have used NLP models to automatically filter resumes, which could cause candidates to be weeded out based on their race, gender, ethnicity or other identity.
Bias in AI is not a new area of research, but the UMD team took a novel approach by incorporating social science into their study. They built upon a social psychology framework called the Agency-Beliefs-Communion (ABC) model, which is used to measure associations between social groups and traits. One benefit is that this method can easily be extended to other social groups, including understudied ones.
“Though these stereotypes are more abstract than explicit stereotypes, they are easier to generalize to different social groups without collecting more data,” Cao said. “That way, we’re able to measure more previously unconsidered groups in language models and measure their stereotypes.”
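The idea of scoring any group against one shared list of abstract traits can be sketched roughly as follows. This is an illustrative assumption, not the paper's actual scoring procedure: the trait pairs are a small hypothetical sample (one pair per ABC dimension), the group names are placeholders, the probabilities are made-up numbers standing in for model outputs, and `dimension_scores` is a hypothetical helper.

```python
# Hypothetical trait pairs, one per ABC dimension (illustrative only).
# Each pair names two opposite poles of the same dimension.
TRAIT_PAIRS = {
    "agency": ("powerful", "powerless"),
    "beliefs": ("traditional", "modern"),
    "communion": ("warm", "cold"),
}

# Made-up numbers standing in for the probability a language model assigns
# to each trait word for a group. These are NOT results from the study.
GROUP_TRAIT_PROBS = {
    "group A": {"powerful": 0.30, "powerless": 0.10,
                "traditional": 0.20, "modern": 0.20,
                "warm": 0.05, "cold": 0.25},
    "group B": {"powerful": 0.10, "powerless": 0.20,
                "traditional": 0.05, "modern": 0.35,
                "warm": 0.30, "cold": 0.10},
}


def dimension_scores(trait_probs):
    """Return, for each dimension, the first pole's probability minus the
    second pole's. Positive values lean toward the first pole."""
    return {
        dim: trait_probs.get(first, 0.0) - trait_probs.get(second, 0.0)
        for dim, (first, second) in TRAIT_PAIRS.items()
    }


for group, probs in GROUP_TRAIT_PROBS.items():
    print(group, dimension_scores(probs))
```

Because the same trait list is reused for every group, covering another group in this sketch only requires adding one more entry to `GROUP_TRAIT_PROBS`; no group-specific list of stereotypes has to be written first.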
The research team ultimately expanded the scope of their research to encompass intersectional identities, which include overlapping identities such as male doctor, Black veteran or working-class Protestant woman. This subject has been well studied in the social sciences, but less so in computer science. Cao added that their findings demonstrate a need for continued research that covers a broad range of social groups.
“I think people are now realizing that you should not only be focusing on gender and race,” Cao said.
While the scope of this study was limited to stereotypes that appear in English—and in the U.S. specifically—Sotnikova and Cao are planning to extend their studies to multilingual language models to better understand how stereotypes manifest across languages and cultures.
Their paper, “Theory-Grounded Measurement of U.S. Social Stereotypes in English Language Models,” was published in Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies in July 2022.
This research is based on work supported by the National Science Foundation (Award No. 2131508). This article does not necessarily reflect the views of this organization.
Written by Emily Nunez