Artificial intelligence's mathematical prowess is proving formidable, provided it is given the time it needs, after Google DeepMind's AI models reportedly achieved silver-medal-level performance at the International Mathematical Olympiad (IMO) this week.
The two systems, AlphaProof and AlphaGeometry 2, worked together on six difficult problems from the esteemed competition. The combined system solved four of the six and scored 28 points out of a possible 42, one point short of the gold-medal threshold.
Remarkably, the AI earned a perfect score on the hardest problem in the competition, one that only five human competitors managed to solve. With this accomplishment, DeepMind's AI now ranks among the world's best young mathematicians.
The two systems employ different methods. AlphaProof, which combines reinforcement learning with a language model, solved two algebra problems and one number theory problem. It uses "formal mathematics" to express proofs as programs that can be mechanically verified, which allows the system to train on its own output and improve.
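To give a flavor of what "formal mathematics" means here, the sketch below is a trivial theorem in Lean, the proof language AlphaProof reportedly works in. This toy example is ours, not one of the IMO problems; the point is that the compiler mechanically checks every step, so a proof that type-checks is guaranteed correct.

```lean
-- A trivial theorem stated and proved in Lean 4.
-- `Nat.add_comm` is the standard-library proof that
-- addition on natural numbers is commutative.
theorem sum_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

Because correctness is verified by the compiler rather than judged by a human, a system like AlphaProof can generate candidate proofs at scale and learn from the ones that check out.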
AlphaGeometry 2, meanwhile, concentrated on the geometry problem and produced a solution in an impressive 16 seconds. Its answer took a novel approach that startled even human specialists, demonstrating the AI's capacity for original thinking.
Google DeepMind's AI Takes its Time
The AI performed well in some domains but poorly in others. The systems made no progress on two of the six problems, and Google's AI took anywhere from a few minutes to three days to solve the others.
For comparison, DeepMind's AI took three days on one especially challenging problem, whereas human competitors work under a nine-hour time limit.
Prof. Gowers noted numerous significant limitations while conceding that the result was well beyond what automated theorem provers had previously been able to achieve.
Notwithstanding the drawbacks, Google DeepMind's accomplishment is said to mark a substantial advance in AI's capacity for mathematical reasoning.
The creation of AI systems that can solve challenging mathematical problems may have profound effects on a range of fields, from education to science.
AI's Mathematical Capabilities
A growing number of AI models are being enhanced to improve their mathematical and other capabilities. ChatGPT, for example, received a major update in April, in which OpenAI targeted the GPT-4 Turbo model available to ChatGPT Plus, Team, and Enterprise subscribers.
The update improved ChatGPT's writing, math, logical reasoning, and coding skills.
OpenAI claims that the update brings significant gains in mathematics and on GPQA (Graduate-Level Google-Proof Q&A), a benchmark that evaluates answers to multiple-choice questions across a variety of scientific fields. OpenAI's charts show notable improvements in several domains.
The revised ChatGPT is said to respond in a more conversational tone that is "more direct" and "less verbose." OpenAI emphasizes that the model's language has been refined to create a more human-like experience when interacting with ChatGPT. These improvements are meant to facilitate communication and give users concise, relevant information.