Reengineering Bias: What an Interesting Idea
April 5, 2021
If this is true, AI may be in trouble now. VentureBeat reports, “Researchers Find that Debiasing Doesn’t Eliminate Racism from Hate Speech Detection Models.” It is known that AI systems meant to detect toxic language have a bias problem of their own. Specifically, they tend to flag text by Black users more often than text by white users. Oh, the irony. The AI gets hung up on language markers often found in vernaculars like African-American English (AAE). See the article for a few examples. Researchers at the Allen Institute for AI tried several techniques to reteach existing systems to be more even-handed. Reporter Kyle Wiggers writes:
“In the course of their work, the researchers looked at one debiasing method designed to tackle ‘predefined biases’ (e.g., lexical and dialectal). They also explored a process that filters ‘easy’ training examples with correlations that might mislead a hate speech detection model. According to the researchers, both approaches face challenges in mitigating biases from a model trained on a biased dataset for toxic language detection. In their experiments, while filtering reduced bias in the data, models trained on filtered datasets still picked up lexical and dialectal biases. Even ‘debiased’ models disproportionately flagged text in certain snippets as toxic. Perhaps more discouragingly, mitigating dialectal bias didn’t appear to change a model’s propensity to label text by Black authors as more toxic than white authors. In the interest of thoroughness, the researchers embarked on a proof-of-concept study involving relabeling examples of supposedly toxic text whose translations from AAE to ‘white-aligned English’ were deemed nontoxic. They used OpenAI’s GPT-3 to perform the translations and create a synthetic dataset — a dataset, they say, that resulted in a model less prone to dialectal and racial biases.”
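To make that proof of concept a bit more concrete, here is a minimal Python sketch of the relabeling step as the quote describes it. The helper names (translate_dialect, deemed_toxic) are hypothetical stand-ins for the study’s GPT-3 translations and toxicity judgments, not the researchers’ actual code:

```python
# A minimal sketch of the relabeling proof of concept described above.
# Both helpers are hypothetical stand-ins: the actual study used GPT-3
# for the dialect translation and separate judgments for toxicity.

def translate_dialect(text: str) -> str:
    """Stand-in for the AAE -> 'white-aligned English' translation step."""
    return text  # substitute a real language-model call here

def deemed_toxic(text: str) -> bool:
    """Stand-in for judging whether the translated text reads as toxic."""
    return False  # substitute a real classifier or human label here

def relabel(dataset: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Flip 'toxic' labels whose dialect translations are deemed nontoxic,
    yielding the kind of synthetic dataset the researchers trained on."""
    relabeled = []
    for text, label in dataset:
        if label == "toxic" and not deemed_toxic(translate_dialect(text)):
            label = "nontoxic"  # the original flag tracked dialect, not content
        relabeled.append((text, label))
    return relabeled
```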
The researchers acknowledge that rewriting Black users’ posts to sound more white is not a viable solution. The real fix would be to expose AI systems to a wider variety of dialects during the original training phase, but will developers take the trouble? As with many people, once hate-speech detection bots become prejudiced, it is nigh impossible to train them out of it.
Cynthia Murrell, April 5, 2021