Baldur's Gate 3 tool discovers there are around 1,888 characters with dialogue in the game, though 'justice for Karlach was actually the main reason' it was made
That's a lotta mouth-flappin'.
Baldur's Gate 3 has a lot of words, and even 'a lot of words' is a major understatement. In a Steam post before the game's release, it was revealed the game's total script is about 2 million words long. For context, the five books published so far in A Song of Ice and Fire, the series behind Game of Thrones, add up to about 1.7 million words. Big. It's a big game.
Which is why I was pretty damn impressed to find this tool casually popping up on the game's subreddit, able to show which character has had the most changes to their dialogue since launch. It's Wyll, which is interesting but not a huge surprise, seeing as his story stood to gain the most from some added nattering (we still like him, though). Still, I wanted to know how the heck something like this was built, so I reached out to the tool's creator.
They go by the name of Invuska on Reddit, GitHub, the Larian Forums and Discord, and they credit the BG3 Patch Dialogue Difference Tool's existence to a shared effort by other modders in the community. "The extractor (by Norbyte), multi-tool (ShinyHobo), dialog parser (roksik-dnd & anonymous collaborator), and the dialog difference tool (me)—all of the prior work is what made development of this tool (and many others) manageable."
While Invuska mentions that without the collaborative effort this thing could've been easily "twice the amount of work", they've also got some compliments for Larian Studios itself. "Each line contained 'character codes' for which line was associated with which character and was structured in a way that I could fairly easily pick it apart … a data scientist loves nothing more than already very well structured and clean data to work with."
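To give a rough feel for why per-line character codes make this kind of tool manageable, here is a minimal Python sketch of a per-character diff between two versions of a script. It is not Invuska's code: the dictionary layout, the handle names, the character codes, and the placeholder text are all assumptions made up for illustration, and the real game files are structured differently.

```python
from collections import Counter

# Hypothetical data shape: line handle -> (character code, line text).
# Handles, codes, and text are invented placeholders, not real game data.
launch = {
    "h001": ("WYLL", "placeholder line one"),
    "h002": ("KARLACH", "placeholder line two"),
    "h003": ("WYLL", "placeholder line thre"),
}
patch = {
    "h001": ("WYLL", "placeholder line one"),
    "h002": ("KARLACH", "placeholder line two"),
    "h003": ("WYLL", "placeholder line three"),  # typo fixed
    "h004": ("WYLL", "placeholder line four"),   # brand new line
}

def diff_by_character(old, new):
    """Count changed and added lines per character code."""
    changed, added = Counter(), Counter()
    for handle, (character, text) in new.items():
        if handle not in old:
            added[character] += 1
        elif old[handle][1] != text:
            changed[character] += 1
    return changed, added

changed, added = diff_by_character(launch, patch)
totals = changed + added  # combined changes and additions per character
print(totals.most_common())
```

With the character code already attached to every line, the grouping is a simple tally. Without it, each line would first have to be matched to a speaker by some other means, which is presumably where the "twice the amount of work" would come from.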
As for their own personal observations, Invuska's only just finished their first playthrough, which means they haven't been diving too deep into the script beyond a broad, numbers-based overview. Instead, they've been staggered, again, by how mammoth a game Baldur's Gate 3 is.
"There are approximately [over] 1,888 characters with dialog in the game, even more considering some dialog may be misattributed and that this count doesn't include generic dialog (e.g. generic group of goblins). I definitely have not talked to 1,888 characters."
They also have a pretty good idea of how many lines (each of which could run to several sentences) the game has. "From what the internal code of the tool gathers there are 114,921 lines [in Patch 5]," compared to "110,869 on launch day." The tool does highlight a ton of typo fixes, but as Invuska mentions: "It's easy to think from the difference tool that there are a lot of typos in the script, but notice how in-game you don't even see them! That just goes to show how massive this game is."
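For a sense of scale, the two line counts quoted above can be compared with some quick arithmetic. This is a back-of-the-envelope sketch using only those two figures. The result is a net difference, so any lines removed between launch and Patch 5 would be hidden inside it.

```python
launch_lines = 110_869   # line count quoted for launch day
patch5_lines = 114_921   # line count quoted for Patch 5

net_added = patch5_lines - launch_lines
growth_pct = 100 * net_added / launch_lines

print(f"{net_added:,} more lines ({growth_pct:.1f}% growth)")
# prints "4,052 more lines (3.7% growth)"
```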
As for why Invuska would put this all together, that's down to one simple reason: justice for our big lady. "Justice for Karlach was actually the main reason why the tool was created, with more primitive code being created sometime in September after Patch 2 … many of us on Reddit, Discord, and in the Larian Forums thread for Karlach were/are quite hungry for an Infernal Engine fix of some sort that didn't necessitate her becoming a mindflayer or her having to go back to the Hells."
This means the tool started out targeting one specific character, then expanded to the whole cast: "I started working on simpler versions of the tool to satiate some of my curiosity/anticipation. A few others seemed to share the same curiosity and were interested in its development. Seeing how this tool may be useful for characters outside of just Karlach, I fleshed out my small collection of scripts for a more 'everyone-ready' version that you see today."
I live for this stuff. While some might take a dim view of data mining, it's clear that stats wizardry has a lot in common with speedrunning. Neither community is trying to 'break' a game. Instead, both are finding all the hidden secrets in between lines of code.
It's an expression of love, kinda like how you might wear out your favourite bit of hardware. As for the tool itself, Invuska's happy to share. "[I'm] planning to create more mods and tools in the future, so stay tuned. Also, the tool is open source on an MIT licence for anyone who is interested in forking/extending/etc. Go wild."
Harvey's history with games started when he first begged his parents for a World of Warcraft subscription aged 12, though he's since been cursed with Final Fantasy 14-brain and a huge crush on G'raha Tia. He made his start as a freelancer, writing for websites like Techradar, The Escapist, Dicebreaker, The Gamer, Into the Spine—and of course, PC Gamer. He'll sink his teeth into anything that looks interesting, though he has a soft spot for RPGs, soulslikes, roguelikes, deckbuilders, MMOs, and weird indie titles. He also plays a shelf load of TTRPGs in his offline time. Don't ask him what his favourite system is, he has too many.