
According to Apple researchers, the race to develop artificial general intelligence (AGI) still has a long way to go, as they found that leading AI models still struggle to reason.
The latest updates to leading AI large language models (LLMs), such as OpenAI’s ChatGPT and Anthropic’s Claude, still fall short of genuine reasoning, Apple researchers said in a June paper titled “The Illusion of Thinking.”
They noted that current evaluations focus primarily on established mathematical and coding benchmarks, “emphasizing final answer accuracy.”
However, such evaluation does not provide insight into the reasoning capabilities of AI models.
The research runs counter to expectations that artificial general intelligence is only a few years away.
Apple researchers test “thinking” AI models
The researchers devised various puzzle games to test “thinking” and “non-thinking” variants of Claude Sonnet, OpenAI’s o3-mini and o1, and the DeepSeek-R1 and V3 chatbots, going beyond standard mathematical benchmarks.
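To illustrate what such an evaluation can look like, here is a minimal sketch of a controllable-complexity puzzle, Tower of Hanoi, one of the environments described in the paper, together with a checker that scores a proposed move sequence. The function names and harness structure are hypothetical, not the researchers’ actual code; the sketch only shows how accuracy can be measured as puzzle complexity grows.

```python
# Illustrative sketch only: a controllable-complexity puzzle (Tower of Hanoi)
# and a checker that scores a model's proposed move sequence. The names and
# structure are hypothetical, not the paper's actual evaluation harness.

def is_valid_solution(n_disks: int, moves: list[tuple[int, int]]) -> bool:
    """Check whether a sequence of (from_peg, to_peg) moves solves an
    n-disk Tower of Hanoi starting on peg 0 and ending on peg 2."""
    pegs = [list(range(n_disks, 0, -1)), [], []]  # peg 0 holds disks n..1, largest at bottom
    for src, dst in moves:
        if not pegs[src]:
            return False                      # illegal: moving from an empty peg
        disk = pegs[src][-1]
        if pegs[dst] and pegs[dst][-1] < disk:
            return False                      # illegal: larger disk onto a smaller one
        pegs[dst].append(pegs[src].pop())
    return pegs[2] == list(range(n_disks, 0, -1))  # solved: all disks on the target peg


def reference_solution(n_disks: int, src=0, aux=1, dst=2) -> list[tuple[int, int]]:
    """The explicit recursive algorithm; the optimal solution grows as 2^n - 1 moves."""
    if n_disks == 0:
        return []
    return (reference_solution(n_disks - 1, src, dst, aux)
            + [(src, dst)]
            + reference_solution(n_disks - 1, aux, src, dst))


# Scoring a (hypothetical) model answer at increasing complexity levels:
for n in range(3, 11):
    model_moves = reference_solution(n)       # stand-in for a parsed model output
    print(n, len(model_moves), is_valid_solution(n, model_moves))
```

Because the number of disks can be dialed up freely, this kind of setup lets researchers track exactly where a model’s accuracy breaks down as complexity rises, rather than relying on a single pass/fail benchmark score.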
They found that “frontier LRMs face a complete accuracy collapse beyond certain complexities,” that they fail to reason effectively, and that their advantage fades as complexity rises, contrary to expectations for AGI capabilities.
“We found that LRMs have limitations in exact computation: they fail to use explicit algorithms and reason inconsistently across puzzles,” the researchers said.
They found the models’ reasoning to be inconsistent and shallow, and they also observed overthinking, with AI chatbots generating correct answers early on and then wandering into incorrect reasoning.
Related: AI agents are solidifying their role in Web3, challenging DeFi and gaming: DappRadar
The researchers concluded that LRMs mimic reasoning patterns without truly internalizing or generalizing them, falling short of AGI-level reasoning.
“These insights challenge prevailing assumptions about LRM capabilities and suggest that current approaches may be encountering fundamental barriers to generalizable reasoning.”
The race to AGI
AGI is the holy grail of artificial intelligence development: the point at which machines can think and reason like humans, with intelligence comparable to our own.
In January, OpenAI CEO Sam Altman said the company was closer than ever to building AGI. “We are now confident we know how to build AGI as we have traditionally understood it,” he said.
In November, Anthropic CEO Dario Amodei said AGI would exceed human capabilities within the next two years. “If you just eyeball the rate at which these capabilities are increasing, it does make you think that we’ll get there by 2026 or 2027,” he said.
Magazine: Ignore the AI job doomsayers, AI is good for employment, says PwC: AI Eye