Elon Musk's Grok 3 vs ChatGPT vs DeepSeek: Which Is the Best AI Chatbot Available Today?

Can Grok 3, a newly launched AI chatbot, compete with its established counterparts?

Elon Musk's xAI has finally released Grok 3, its new AI model, claiming it outperforms competitors including OpenAI and DeepSeek. But does it deliver?

With new reasoning features and more computing power, Grok 3 has come a long way, but AI experts are still on the fence about its prospects. Let's find out how it stacks up against the top AI models available.

Is Grok 3 a Breakthrough in AI Reasoning?

Musk revealed the Grok 3 family in a live stream on X, introducing Grok 3 Reasoning (beta) and Grok 3 Mini Reasoning. Unlike conventional generative AI models, reasoning models "think" through a problem before answering, which can reduce misinformation and hallucinations. This is an important step toward improving AI reliability and accuracy.

xAI claims that Grok 3 outperforms OpenAI's o1 and DeepSeek-R1 in key benchmarks. In Chatbot Arena's blind testing, Grok 3, under its codename "chocolate," ranked highly, suggesting it has caught up with industry leaders despite its late entry into the market.

Can ChatGPT be Dethroned?

AI pioneer Andrej Karpathy, an OpenAI founding member and former Tesla AI director, tested Grok 3 and shared his insights.

According to him, Grok 3 with its Deep Search reasoning feature is competitive with OpenAI's top-tier models like o1-pro ($200/month) and slightly outperforms DeepSeek-R1.

Despite this progress, Karpathy doesn't believe Grok 3 will be enough to make users cancel their ChatGPT subscriptions.

AI professor Ethan Mollick echoed this sentiment, stating, "Grok 3 came in right at expectations," adding that compute power and speed remain the key differentiators in AI development.

xAI's Benchmark Controversy: Did They Overhype Grok 3?

Grok 3's performance charts quickly went viral, showing it outperforming key competitors. However, OpenAI's Rex Asabor challenged these claims, sharing an "updated" chart that showed OpenAI's o3 model beating Grok 3 in math and science benchmarks.

While OpenAI's o3 is not yet publicly available, this comparison adds a layer of skepticism to xAI's claims.

In all fairness, xAI might not have been privy to the recent OpenAI benchmark scores when they ran their tests. Nevertheless, this is a reminder that the AI race is far from over.

Grok 3's Exponential Growth: The Tip of the Iceberg?

Despite some doubts, Grok 3's rapid development is undeniable, Mashable writes. Google and OpenAI enjoyed head starts of 13 and 8 years, respectively, while xAI was founded only in 2023. Nevertheless, Grok 3 has already joined the discussion as a leading AI model.

Musk also said that Grok 3 was trained with 10 times the compute capacity of Grok 2, using 200,000 GPUs. This is consistent with the general pattern that more computing power tends to yield better AI performance.

Researchers such as Gary Marcus, however, question whether scaling up compute will keep increasing AI intelligence beyond a certain threshold.

Grok 3's Limitations: Still a Work in Progress

Though Grok 3 has come a long way, it still experiences many of the same issues as other AI models:

  • Weak Humor Abilities: Like many AI models, Grok 3 struggles to produce good humor, frequently resorting to bland dad jokes.
  • SVG Image Generation Challenges: AI models often struggle to position intricate visual elements. Grok 3 outperformed alternatives such as Gemini 1.5 Flash but still suffered from spatial relationship problems.
  • Political Bias Issues: Musk has framed Grok as an "anti-woke" counterpoint to AI models that have been criticized for political correctness. But Karpathy found that Grok 3 would not discuss some ethical challenges, which may make it more "overly sensitive" than Musk's audience would like.

Musk has said in the past that earlier Grok models leaned left because publicly available training material tends to lean left. Future updates, he has vowed, will make Grok more politically neutral.

What Is the Best Option?

Grok 3 is available first to X Premium+ subscribers, a plan whose price recently rose to $50 per month. While the model has made significant progress, it may not yet be enough to dethrone OpenAI's ChatGPT or DeepSeek, the latter of which is banned on government devices in Australia.

For fans of AI and Musk, Grok 3 is a welcome addition to the AI scene. But for anyone looking for the very best AI model, Grok 3 may not yet be strong enough to justify switching from the top industry players.

According to another review by Decrypt, Grok 3 allows more "free speech" than other AI chatbots. When it comes to coding, it "just works" better than others. In math reasoning, however, OpenAI and DeepSeek still have the better chatbots.

ⓒ 2024 TECHTIMES.com All rights reserved. Do not reproduce without permission.