Google's AI solved four out of six problems in one of the world's hardest maths competitions, equivalent to a silver medal standard 'in a certain sense'


The International Mathematical Olympiad is not just a terrifying sequence of words for someone as maths-blind as myself, but also a notoriously challenging world championship mathematics competition for high school students from over 100 different countries. Each year students compete to show off their mathematical prowess in a chosen host country, each aiming to solve problems that would make the rest of us cower in fear.

Google DeepMind has announced that two of its AI systems, AlphaProof and AlphaGeometry 2, took on this year's contest questions as a combined system. The AI had its solutions scored by former IMO gold medalists Professor Sir Timothy Gowers and Dr Joseph Myers, the latter of whom is Chair of the IMO 2024 Problem Selection Committee itself.

Not only did the AI chalk up a combined score of 28 out of 42, one point off the 29 required for a gold medal, but it also achieved a perfect score on the competition's hardest problem (via Ars Technica). Just as well really, as two combinatorics problems remained unsolved. Still, stick to what you're good at, ey?

There's a slight fly in the ointment, however. In a Twitter thread, Prof Sir Timothy Gowers points out that while the AI did indeed score higher than most, it needed a lot longer than human competitors to do so. Human candidates submit their answers in two four-and-a-half-hour sessions—and while one problem was solved by the AI within minutes, it took up to three days to solve the others.

"If the human competitors had been allowed that sort of time per problem they would undoubtedly have scored higher," wrote Gowers.

"Nevertheless, (i) this is well beyond what automatic theorem provers could do before, and (ii) these times are likely to come down as efficiency gains are made."

Not only that, but it's not like the AI sat down in front of a test paper and began chewing on its pencil. The problems were manually translated into Lean, a proof assistant and programming language, so the formalization of the questions was carried out by old-fashioned humans rather than by the AI itself.
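
For a rough sense of what a formalized statement looks like, here's a minimal Lean 4 sketch. To be clear, this is a toy example of my own choosing and not one of the IMO 2024 problems, and the theorem name is made up for illustration. The human-written part is the statement before `:=`, and the proof after it is the sort of thing a prover has to come up with.

```lean
-- Toy illustration only: NOT one of the IMO 2024 problems.
-- The statement (everything before `:=`) is what a human would formalize.
-- The proof (everything after `:=`) is what a prover would need to find.
theorem sum_is_commutative (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

Actual competition problems are vastly harder to state and prove than this, but the shape is the same: a precise statement that Lean can check, followed by a proof it will only accept if every step holds up.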

Still, as the good Professor points out, what the AI has achieved here is a lot more involved and nuanced than simply brute forcing the problems:

"We might be close to having a program that would enable mathematicians to get answers to a wide range of questions, provided those questions weren't *too* difficult—the kind of thing one can do in a couple of hours."

"Are we close to the point where mathematicians are redundant? It's hard to say. I would guess that we're still a breakthrough or two short of that." 


Andy Edser
Hardware Writer
