Google just taught its AI how to use the internet — like, actually use it. We’re talking clicking, scrolling, filling out forms — the whole deal.

It’s called Gemini 2.5 Computer Use, and it can literally navigate a website the same way you or I would.

Here’s the rundown:

It uses visual reasoning to look at a webpage, figure out what needs doing, and then… just does it.

It can fill out and submit forms, test user interfaces, or even buy stuff online. And it doesn't need any special API access; it works inside the same browser environment a human would.

Google’s already showing demos where Gemini plays 2048, scrolls through Hacker News, and completes tasks at lightning speed.

The only limit right now? It’s browser-only.

No deep computer control yet — just 13 core actions like clicking, typing, and dragging. But Google says it already outperforms competitors on major web and mobile benchmarks.
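Under the hood, this kind of agent runs a simple loop: send a screenshot of the page plus the goal to the model, get back one of those core actions (click here, type this, drag that), execute it in the browser, and repeat until the model says it's done. Here's a minimal sketch of that loop — with a hard-coded stand-in for the model call, since the action names and plumbing below are illustrative, not Google's actual API:

```python
from dataclasses import dataclass

# Hypothetical action set modeled on the announced core actions
# (click, type, drag, ...). Names are illustrative, not Google's API.
@dataclass
class Action:
    kind: str          # "click_at", "type_text", "done", ...
    x: int = 0
    y: int = 0
    text: str = ""

def pick_action(screenshot: bytes, goal: str, step: int) -> Action:
    """Stand-in for the model call: the real system sends the
    screenshot + goal to Gemini 2.5 Computer Use and gets back the
    next UI action. Here we just replay a tiny scripted plan."""
    plan = [
        Action("click_at", x=400, y=120),   # focus the search box
        Action("type_text", text=goal),     # type the query
        Action("done"),                     # model decides it's finished
    ]
    return plan[min(step, len(plan) - 1)]

def run_agent(goal: str, max_steps: int = 10) -> list[str]:
    """Screenshot -> model -> action -> repeat, until 'done'."""
    log = []
    for step in range(max_steps):
        screenshot = b"...png bytes from the browser..."  # e.g. via Playwright
        action = pick_action(screenshot, goal, step)
        if action.kind == "done":
            break
        # A real client would dispatch this to the browser
        # (mouse click at x/y, keyboard input, etc.).
        log.append(f"{action.kind}({action.x},{action.y},{action.text!r})")
    return log

print(run_agent("best mechanical keyboards"))
```

The point is that the model never touches the browser directly — it only ever sees pixels and emits one action at a time, which is why a small fixed action vocabulary is enough.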

And of course, the timing couldn’t be better. 

This drop came right after OpenAI's Dev Day (the one with new ChatGPT apps and agent features) and right on the heels of Anthropic's computer-use updates.

Clearly, the AI race isn’t slowing down.

And as if that wasn’t enough, Google also announced Gemini CLI Extensions — a new system that lets developers integrate third-party systems or apps into its AI environment.

The first extension introduced is a hook into Google's Nano Banana image generator — so devs can literally create images straight from the terminal.

The wild part? Google’s keeping it open-source. Anyone can publish extensions on GitHub, no approval needed.

Compare that to OpenAI’s tightly curated ecosystem, and yeah… it’s a very Google move.

So between Gemini learning to use the web and devs building their own AI tools, we’re watching Google turn Gemini from a chatbot into an AI teammate.

You should check it out for yourself. 

