Human reinforced learning could mean 'more truthful and less toxic' AI

Cortana in the Halo TV series
(Image credit: Paramount)

AI has been making huge leaps in terms of scientific research, and companies like Nvidia and Meta are continuing to throw more resources towards the technology. But AI learning can have a pretty huge setback when it adopts the prejudices of those who make it. Like all those chatbots that wind up spewing hate speech thanks to their exposure to the criminally online.

According to Golem, the OpenAI might have made some headway on that with its new successor to the GPT-3, the autoregressive language model that uses deep learning in an effort to appear human in text. It wrote this article, if you want an example of how that works.

Tips and advice

The Nvidia RTX 3070 and AMD RX 6700 XT side by side on a colourful background

(Image credit: Future)

How to buy a graphics card: tips on buying a graphics card in the barren silicon landscape that is 2021

But GPT-3 also has a tendency to parrot incorrect, biased, or outright toxic notions thanks to all the sources of information. These biases would impact the language, causing GPT-3 to make bigoted assumptions or implications in its writing. It’s not too different from humans, in that all these reinforced ideas can easily look like truths, and there’s plenty of outdated notions to choose from. GPT-3 seems a bit like the weird Uncle you don’t talk to on Facebook.

The new InstructGPT is said to be an improvement as its answers are "more truthful and less toxic”. This has been achieved thanks to the work of researchers at Open AI, who’s alignment research helps the machine process instructions more accurately, despite being much smaller. InstructGPT uses 1.3 billion parameters, which is a fraction of the 175 billion used by the older GPT-3 model but thanks to reinforcement learning with human feedback, has simply been better trained. The quality of InstructGPT’s answers are assessed and reported on by researchers, hopefully shaping it to be a better bot overall.

That being said, though InstructGPT seems like a promising step up, it's still far from perfect. "They still generate toxic or biased results, fabricate facts and generate sexual and violent content without explicit request" according to the researchers at OpenAI but it’s still less than the older GPT-3. Perhaps in a few generations we’ll see a language AI that’s a bit further unravelled from some of the worst aspects of humanity. 

Hope Corrigan
Hardware Writer

Hope’s been writing about games for about a decade, starting out way back when on the Australian Nintendo fan site Vooks.net. Since then, she’s talked far too much about games and tech for publications such as Techlife, Byteside, IGN, and GameSpot. Of course there’s also here at PC Gamer, where she gets to indulge her inner hardware nerd with news and reviews. You can usually find Hope fawning over some art, tech, or likely a wonderful combination of them both and where relevant she’ll share them with you here. When she’s not writing about the amazing creations of others, she’s working on what she hopes will one day be her own. You can find her fictional chill out ambient far future sci-fi radio show/album/listening experience podcast right here. No, she’s not kidding. 

Read more
Closeup of the new Copilot key coming to Windows 11 PC keyboards
Microsoft co-authored paper suggests the regular use of gen-AI can leave users with a 'diminished skill for independent problem-solving' and at least one AI model seems to agree
CHINA - 2025/02/11: In this photo illustration, a Roblox logo is seen displayed on the screen of a smartphone. (Photo Illustration by Sheldon Cooper/SOPA Images/LightRocket via Getty Images)
'Humans still surpass machines': Roblox has been using a machine learning voice chat moderation system for a year, but in some cases you just can't beat real people
The OpenAI logo is being displayed on a smartphone with an AI brain visible in the background, in this photo illustration taken in Brussels, Belgium, on January 2, 2024. (Photo illustration by Jonathan Raa/NurPhoto via Getty Images)
OpenAI is working on a new AI model Sam Altman says is ‘good at creative writing’ but to me it reads like a 15-year-old's journal
OpenAI logo displayed on a phone screen and ChatGPT website displayed on a laptop screen are seen in this illustration photo taken in Krakow, Poland on December 5, 2022.
New research says ChatGPT likely consumes '10 times less' energy than we initially thought, making it about the same as Google search
SAN FRANCISCO, CALIFORNIA - NOVEMBER 06: OpenAI CEO Sam Altman speaks during the OpenAI DevDay event on November 06, 2023 in San Francisco, California. Altman delivered the keynote address at the first-ever Open AI DevDay conference.(Photo by Justin Sullivan/Getty Images)
In a mere decade 'everyone on Earth will be capable of accomplishing more than the most impactful person can today' says OpenAI boss Sam Altman
A digitally generated image of abstract AI chat speech bubbles overlaying a blue digital surface.
We need a better name for AI, or we risk talking past each other until actually intelligent AGI comes home mooing
Latest in AI
Otter AI Meeting Agent
As if your work meetings weren't already fun enough, now Otter has a new all-hearing AI agent that remembers everything anyone has said and can join in the discussion
Image for
'No real human would go four links deep into a maze of AI-generated nonsense': Cloudflare's AI Labyrinth uses decoy pages to trap web-crawling bots and feed them slop 'as a defensive weapon'
CHINA - 2025/02/11: In this photo illustration, a Roblox logo is seen displayed on the screen of a smartphone. (Photo Illustration by Sheldon Cooper/SOPA Images/LightRocket via Getty Images)
'Humans still surpass machines': Roblox has been using a machine learning voice chat moderation system for a year, but in some cases you just can't beat real people
OpenAI logo displayed on a phone screen and ChatGPT website displayed on a laptop screen are seen in this illustration photo taken in Krakow, Poland on December 5, 2022.
ChatGPT faces legal complaint after a user inputted their own name and found it accused them of made-up crimes
Public Eye trailer still - dead-eyed police officer sitting for an interview
I'm creeped out by this trailer for a generative AI game about people using an AI-powered app to solve violent crimes in the year 2028 that somehow isn't a cautionary tale
Closeup of the new Copilot key coming to Windows 11 PC keyboards
Microsoft co-authored paper suggests the regular use of gen-AI can leave users with a 'diminished skill for independent problem-solving' and at least one AI model seems to agree
Latest in News
Lara Croft Unified Art
Tomb Raider developer Crystal Dynamics lays off 17 employees 'to better align our current business needs and the studio's future success'
A long bendy arm stealing money from people in a subway car
'You're a very long arm. You steal things. It's a comedy game,' explains developer of comedy game where you steal things with a very long arm
The heroes are attacked by monsters
Pillars of Eternity is getting turn-based combat to mark its 10th anniversary, and that means PC Gamer editors will soon be arguing about combat mechanics again
Image of Ronaldo from Fatal Fury: City of the Wolves trailer
It doesn't really make sense that soccer star Ronaldo is now a Fatal Fury character, but if you follow the money you can see how it happened
Junah beginning a battle in Metaphor: ReFantazio.
Today's RPG fans are 'very sensitive to feeling like they wasted time' when they die, says Metaphor: ReFantazio battle planner—but Atlus still made combat hard anyway
Image of Cersei Lanniser from Game of Thrones: Kingsroad Steam early access trailer
A new Game of Thrones RPG is coming to Steam today with a cast of 'familiar faces,' which is good because it's really the only way to tell it's a GoT game at all