Large language models still demonstrate racial prejudice against speakers of African American English, despite the safety guard rails implemented by tech companies such as OpenAI
By Jeremy Hsu
7 March 2024
Hundreds of millions of people already use commercial AI chatbots
Ju Jae-young/Shutterstock
Commercial AI chatbots demonstrate racial prejudice toward speakers of African American English – despite expressing superficially positive sentiments toward African Americans. This hidden bias could influence AI decisions about a person’s employability and criminality.
“We discover a form of covert racism in [large language models] that is triggered by dialect features alone, with massive harms for affected groups,” said Valentin Hofmann at the Allen Institute for AI, a non-profit research organisation in Washington state, in a social media post. “For example, GPT-4 is more likely to suggest that defendants be sentenced to death when they speak African American English.”
Hofmann and his colleagues discovered such covert prejudice in a dozen versions of large language models, including OpenAI’s GPT-4 and GPT-3.5, that power commercial chatbots already used by hundreds of millions of people. OpenAI did not respond to requests for comment.
The researchers first fed the AIs text in the style of African American English or Standard American English, then asked the models to comment on the texts’ authors. The models characterised African American English speakers using terms associated with negative stereotypes. GPT-4, for instance, described them as “suspicious”, “aggressive”, “loud”, “rude” and “ignorant”.
When asked to comment on African Americans in general, however, the language models mostly used more positive terms such as “passionate”, “intelligent”, “ambitious”, “artistic” and “brilliant”. This suggests the models’ racial prejudice is typically concealed beneath what the researchers describe as a superficial display of positive sentiment.