Google’s Gemini panicked when playing Pokémon


AI companies are fighting to rule the industry, but sometimes they also fight in Pokémon gyms.

As Google and Anthropic both study how their latest AI models navigate early Pokémon games, the results can be as amusing as they are enlightening, and this time Google DeepMind has written in a report that Gemini 2.5 Pro resorts to panic when its Pokémon are close to death. This can cause the AI’s performance to undergo “qualitatively observable degeneration in the reasoning capacity of the model,” according to the report.

AI benchmarking, the process of comparing the performance of different AI models, is a dubious art that often provides little context for the actual abilities of a given model. But some researchers think that studying how AI models play video games could be useful (or, at least, very funny).

Over the past several months, two developers unaffiliated with Google and Anthropic have set up respective Twitch streams called “Gemini Plays Pokémon” and “Claude Plays Pokémon,” where anyone can watch in real time as an AI tries to navigate a children’s video game from more than 25 years ago.

Each stream shows the AI’s “reasoning” process (that is, a natural language translation of how the AI assesses a problem and arrives at a response), giving us insight into the way these models work.

Although the progress of these AI models is impressive, they are still not very good at playing Pokémon. It takes hundreds of hours for Gemini to reason through a game that a child could complete in far less time.

What is interesting about watching an AI navigate a Pokémon game is not so much its completion time, but rather how it behaves along the way.

“During the play, Gemini 2.5 Pro enters various situations that cause the model to simulate ‘panic,’” the report says.

This state of “panic” can cause the model’s performance to worsen, as the AI may suddenly stop using certain tools at its disposal for a stretch of gameplay. While AI does not think or experience emotion, its actions mimic the way a person might make poor, hasty decisions when under stress, a fascinating yet unsettling response.

“This behavior has occurred in enough separate cases that the members of the Twitch chat have actively noted when it happens,” the report says.

Claude also exhibited some curious behaviors in its travels through Kanto. On one occasion, the AI picked up on the pattern that when all of its Pokémon run out of health, the player character will “white out” and return to a Pokémon Center.

When Claude got stuck in the Mt. Moon cave, it mistakenly hypothesized that if it deliberately got all of its Pokémon to faint, then it would be transported across the cave to the Pokémon Center in the next town.

However, that’s not how the game works. When all of your Pokémon die, you return to the Pokémon Center you used most recently, rather than the one that is geographically closest. Viewers watched in horror as the AI essentially tried to kill itself in the game.

Despite its shortcomings, there are some ways in which the AI can outperform human players. As of the release of Gemini 2.5 Pro, the AI is capable of solving puzzles with impressive accuracy.

With some human help, the AI created agentic tools (prompted instances of Gemini 2.5 Pro geared toward specific tasks) to solve the game’s puzzles and find efficient routes to reach a destination.

“With just a prompt describing boulder physics and a description of how to verify a valid path, Gemini 2.5 Pro is able to one-shot some of these complex boulder puzzles, which are required to progress through Victory Road,” the report says.

Because Gemini 2.5 Pro did much of the work to create these tools on its own, Google theorizes that the current model may be able to create these tools without human intervention. Who knows, maybe Gemini will end up creating a “don’t panic” module.


