This AI tool creates singing, rapping, talking avatars from a single image and even the Mona Lisa isn't safe from spitting bars

Image of an AI-generated performance of Shakespeare given by the Mona Lisa
(Image credit: Institute for Intelligent Computing, Alibaba Group)

Remember that late-night talk show bit where an image of a political figure is shown with someone else's mouth superimposed on top, in order to make them say dubious things? It always looked a little ropey, but that was part of the effect. Well, this new AI tool also takes still images of human subjects and animates the mouth and head movements, but this time the effect is surprisingly, almost worryingly convincing.

The tool is called EMO: Emote Portrait Alive, and it's been developed by several researchers from the Institute for Intelligent Computing, part of the Alibaba Group. It takes a single reference image, extracts generated motion frames, and then combines them with vocal audio through a complex diffusion process, in which the facial region is integrated with multi-frame noise samples and then de-noised while generated imagery is added to synch with the audio. The end result is a video of the subject not only lip-synching, but also emoting various facial expressions and head poses.

The technology is demonstrated using sample images of various figures, ranging from real-life celebrities to AI-generated people to the Mona Lisa, while the vocal audio used includes a Dua Lipa track, pre-recorded interview clips, and Shakespearian monologues. After the process has been applied, the generated avatar appears to have come to life, mouthing and moving to the chosen audio.

The effect is surprisingly accurate although, it has to be said, far from perfect. "Buh" sounds sometimes appear to come from open mouths rather than closed lips, and the occasional syllable appears from clenched teeth, as if the avatar is resisting the AI's insistence on bringing them to life to sing and perform for the internet.

Still, it's a remarkable effect, and one that's likely to pass without notice from a casual observer unless they are told specifically to watch out for mouth movements and timing.

Even more impressive is a later demonstration of what the company refers to as "cross-actor performance". A clip shows Joaquin Phoenix in full make-up as the Joker, except this time with the audio of Heath Ledger's interpretation of the character from The Dark Knight, including a reasonable approximation of Ledger's trademark swallowing and lip smacking in the role.

While the technology is undoubtedly impressive, it's likely to do little to dispel the creeping notion that AI deepfake content, and all the nefarious purposes it can potentially be used for, is progressing at a remarkable rate.

While these videos make for excellent tech demonstrations, they are reminders that the difference between what we presume is real and what is computer-generated is rapidly becoming harder to spot as image and video generation technology matures. AI tools can sometimes demonstrate a terrifying ability to churn out generated content at an incredible rate and with increasing complexity, and that has some troubling implications. Although perhaps that's just me being a big old worrywart.

How long will it be, I wonder, before our holiday snaps can be grabbed from our long-defunct Facebook pages, to be turned by AI tools into videos of us mouthing songs we never sang? At least, that's my excuse.

No, I did not drunkenly attempt karaoke in Cyprus. It's an AI-enhanced fake, that one, I promise.

Andy Edser
Hardware Writer

Andy built his first gaming PC at the tender age of 12, when IDE cables were a thing and high resolution wasn't—and he hasn't stopped since. Now working as a hardware writer for PC Gamer, Andy's been jumping around the world attending product launches and trade shows, all the while reviewing every bit of PC hardware he can get his hands on. You name it, if it's interesting hardware he'll write words about it, with opinions and everything.
