Agent Browser: AI-First Browser Automation CLI

37,599 stars

Meet Agent Browser: An AI-First CLI for Browser Automation

If you've ever tried to automate browser tasks with AI, you know the drill: cobble together Puppeteer, some LLM prompt engineering, and a prayer that the selector doesn't break. Vercel Labs just dropped something that skips most of that pain.

Agent Browser is an open-source CLI tool that uses an LLM (like GPT-4) to interpret natural language instructions and control a real browser behind the scenes. It's more than a thin wrapper around Selenium or Playwright: it's a purpose-built automation agent that can navigate, click, fill forms, extract data, and even take screenshots, all from a single command.

What It Does

You give Agent Browser a task in plain English. Something like:

"Go to Hacker News, find the top 5 posts by points, and save their titles and links to a CSV."

The tool spins up a headless Chromium instance, plans the steps, executes them (clicking, scrolling, waiting), and returns the result. The magic is that it decides how to do it, not just what to do.

Under the hood, it uses an LLM to break down your request into browser actions, then executes them with Playwright. It handles errors, retries, and even explains its reasoning as it goes.
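
To make that plan-then-execute flow concrete, here is a minimal sketch of a loop that runs planned actions in order, retries failed steps, and logs each attempt. Everything in it is an assumption for illustration: the Action type, the runPlan function, and the maxAttempts parameter are hypothetical names, not Agent Browser's actual internals, and the executor is a plain synchronous callback standing in for real (asynchronous) browser calls.

```typescript
// Hypothetical action shape, for illustration only (not the tool's real types).
type Action =
  | { kind: "goto"; url: string }
  | { kind: "click"; target: string }
  | { kind: "fill"; target: string; value: string };

type StepResult = {
  action: Action;
  attempts: number;
  ok: boolean;
  error?: string;
};

// Runs each planned action in order, retrying a failed step up to
// maxAttempts times. Stops at the first step that exhausts its retries.
function runPlan(
  plan: Action[],
  execute: (action: Action) => void, // stand-in for a real browser driver
  maxAttempts = 3,
  log: (line: string) => void = console.log,
): StepResult[] {
  const results: StepResult[] = [];
  for (const action of plan) {
    let attempts = 0;
    let ok = false;
    let error: string | undefined;
    while (attempts < maxAttempts && !ok) {
      attempts++;
      // Log every attempt so a wrong step is easy to spot.
      log(`[attempt ${attempts}] ${JSON.stringify(action)}`);
      try {
        execute(action);
        ok = true;
        error = undefined;
      } catch (e) {
        error = e instanceof Error ? e.message : String(e);
      }
    }
    results.push({ action, attempts, ok, error });
    if (!ok) break; // later steps depend on this one, so give up here
  }
  return results;
}
```

In a real agent the plan would come from the LLM and could be revised after a failure. This sketch only covers the execute, retry, and log part of the loop.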

Why It's Cool

No selectors, no scripts. You don't write XPath or CSS selectors. You don't handle pagination or loading states. The model figures that out.

It actually works for real tasks. I tested it on a few scraping jobs that would normally require a custom script. It handled login forms, infinite scroll, and modal popups without any dramatic failures.

Open source and CLI-first. This isn't a SaaS product. You install it with npm, and it runs locally. Your API keys, your browser instance, your data.

Transparent execution. It prints every action it's taking. If it clicks the wrong thing, you see exactly where it went off track. That's rare in AI tools.

Use cases:

  • Quick data scraping without writing code
  • Automating repetitive browser tasks (filling forms, checking dashboards)
  • Testing web apps with natural language instructions
  • Building demos or prototypes that need real browser interaction

How to Try It

You need Node.js 18+ and an OpenAI API key (or any compatible LLM provider).

# Install globally
npm install -g @agent-browser/cli
# Or run directly
npx @agent-browser/cli

Then run a task:

agent-browser "Go to https:

Last updated: Apr 30, 2026