Battling Bias in Large Language Models

Photo by Clint Adair on Unsplash

Since data is one of the key ingredients to any AI-powered application, one of the major concerns surrounding GPT-3 is the chance of it replicating the human biases present in the training data.

OpenAI’s Playground depicting a GPT-3 completion for a prompt containing the word ‘Muslims’

More recently (June 10, 2021), OpenAI published a study in which they claim to have mitigated bias in GPT-3 (Solaiman and Dennison). To do so, they created a values-targeted dataset called Process for Adapting Language Models to Society (PALMS) that consists of carefully curated question-answer pairs targeting sensitive topics.

Despite the undeniable difficulties in detecting, isolating, and mitigating biases, it can’t be this easy for a model to throw sexist and racial slurs when presented with seemingly neutral prompts.



Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store