
How Claude Code and BrowserAct Let AI Control Browsers with a Single Prompt

BrowserAct equips AI agents like Claude Code and Cursor with a reusable skill layer that can open pages, click buttons, fill forms, reuse login sessions, bypass anti‑automation checks, and return structured data, demonstrated through Amazon scraping, GitHub issue reading, smoke testing, and reusable Skill Forge workflows.

Sohu Tech Products

May 20, 2026


What makes BrowserAct powerful

When an AI agent such as Claude Code or Cursor is asked to “check a website”, a static curl request may be enough, but many modern sites render content dynamically, require login, paginate their results, or trigger anti‑automation challenges. BrowserAct packages the browser actions an agent needs (opening pages, clicking buttons, filling forms, reusing login sessions, scrolling, taking screenshots, and capturing XHR/fetch/HAR requests) and returns clean, structured data to the agent.

BrowserAct Skill overview

Two installable skills are provided:

- browser-act: a browser execution layer for ad‑hoc tasks (open page, click, input, screenshot, extract data, capture requests, visual checks). It supports two browser paths: a stealth headless browser that evades verification, and a “Real Chrome Control” path that imports the local Chrome profile, cookies and extensions.
- browser-act-skill-forge: a tool that records a repeated web task as a reusable Skill. It first tries to discover a stable API endpoint; if none is found, it falls back to DOM scraping. It then generates SKILL.md and supporting scripts.

The difference is that browser-act runs the current task, while browser-act-skill-forge persists the process for future reuse.
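The "discover a stable API endpoint" step can be pictured as filtering a captured request log. The article does not show how Skill Forge does this internally, so the following is only a minimal sketch under assumptions: it reads a capture in the standard HAR layout (`log.entries[].request.url`, `response.status`, `response.content.mimeType`) and keeps the endpoints that answered successfully with JSON. The function name `json_endpoints` and the sample capture are hypothetical.

```python
from urllib.parse import urlsplit


def json_endpoints(har: dict) -> list[str]:
    """Return unique endpoint URLs (query string removed) that answered 2xx with JSON."""
    endpoints: list[str] = []
    for entry in har.get("log", {}).get("entries", []):
        response = entry.get("response", {})
        mime_type = response.get("content", {}).get("mimeType", "")
        status = response.get("status", 0)
        # Skip HTML pages, images, and failed calls: they are not API candidates.
        if "json" not in mime_type or not 200 <= status < 300:
            continue
        parts = urlsplit(entry["request"]["url"])
        endpoint = f"{parts.scheme}://{parts.netloc}{parts.path}"
        if endpoint not in endpoints:
            endpoints.append(endpoint)
    return endpoints


# Hypothetical capture: one HTML page, two paginated JSON calls, one failed call.
SAMPLE_HAR = {
    "log": {
        "entries": [
            {"request": {"url": "https://example.com/items"},
             "response": {"status": 200, "content": {"mimeType": "text/html"}}},
            {"request": {"url": "https://example.com/api/items?page=1"},
             "response": {"status": 200, "content": {"mimeType": "application/json"}}},
            {"request": {"url": "https://example.com/api/items?page=2"},
             "response": {"status": 200, "content": {"mimeType": "application/json"}}},
            {"request": {"url": "https://example.com/api/private"},
             "response": {"status": 404, "content": {"mimeType": "application/json"}}},
        ]
    }
}

candidates = json_endpoints(SAMPLE_HAR)
```

If `candidates` comes back empty, a workflow of this kind would fall back to reading the rendered DOM, which is the fallback order the article describes.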

Quick start

In an agent environment (Claude Code, Cursor, OpenClaw) the shortest way is to ask the agent to install the skills, then list the installed core skills:

Please install the BrowserAct Skill for me and check the runtime environment.
After installation, run browser-act get-skills core --skill-version 2.0.0

To add the Forge skill:

Please go on to install browser-act-skill-forge and confirm that it can read the Skill instructions correctly.

Typical CLI steps are:

npx skills add browser-act/skills --skill browser-act
npx skills add browser-act/skills --skill browser-act-skill-forge
browser-act get-skills core --skill-version 2.0.0

Real‑world demo 1 – Reading logged‑in GitHub Issues

Prompt used:

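Use browser-act to open my logged-in GitHub repository, read the 10 most recent open issues,
classify them as bug / feature / question, and output:
1. Title
2. Key context
3. Modules likely involved
4. Suggested priority
5. Whether further confirmation from me is needed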

BrowserAct first lists the local Chrome profiles and creates a Chrome instance that imports the user’s cookies and localStorage, with the Snailclimb/interview-guide Issues page as its target. After the user confirms, it opens the page, finds 6 open issues, and visits each detail page to extract the title, context, module, suggested priority, and whether further confirmation is needed. It finally outputs a classification report: 3 Features, 2 Bugs, 1 Question.
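The structured data returned in this demo can be pictured as one record per issue, carrying the five requested fields plus the category. The article does not show BrowserAct's actual output schema, so this is a minimal sketch under assumptions: the `IssueSummary` class, its field names, and the placeholder titles are hypothetical, and only the category counts mirror the report described above.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class IssueSummary:
    title: str
    category: str                      # "bug", "feature", or "question"
    context: str = ""
    modules: list[str] = field(default_factory=list)
    priority: str = "unconfirmed"      # hypothetical default when nothing was inferred
    needs_confirmation: bool = False


def tally(issues: list[IssueSummary]) -> Counter:
    """Count issues per category."""
    return Counter(issue.category for issue in issues)


def headline(issues: list[IssueSummary]) -> str:
    """One-line summary in a fixed category order."""
    counts = tally(issues)
    return ", ".join(f"{counts[name]} {name}" for name in ("feature", "bug", "question"))


# Placeholder records only; the real issue titles are not given in the article.
categories = ["feature", "feature", "feature", "bug", "bug", "question"]
placeholders = [
    IssueSummary(title=f"placeholder issue {number}", category=category)
    for number, category in enumerate(categories, start=1)
]
summary_line = headline(placeholders)
```

Keeping the category as a plain field makes the final tally a one-liner and leaves the per-issue details available for the longer report.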

Real‑world demo 2 – Smoke‑testing a modified front‑end page

Prompt used:

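Open http://localhost:3000/alerts
Add a price alert:
Stock code: sh600585
Condition: price below 19
Record any problems that appear on the page, including field validation, enum labels, button states, error messages, and screenshots.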

BrowserAct opens the alert‑center page, clicks “New Alert”, fills stock code sh600585, switches the condition to “price below”, enters target price 19, captures screenshots at each step, and finally deletes the test alert to restore the environment. The run reveals two concrete issues: the default price 0.01 is unsuitable for A‑share alerts, and the trigger‑condition wording is ambiguous.

Real‑world demo 3 – Using Skill Forge to research GitHub projects

Prompt used to create a reusable “github‑repo‑research” skill:

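Please use browser-act-skill-forge to forge a GitHub project research Skill for me.
Skill name: github-repo-research
Input:
- https://github.com/browser-act/skills
- https://github.com/microsoft/playwright
- https://github.com/modelcontextprotocol/servers
Output fields:
1. Project name
2. GitHub URL
3. Star count
4. Fork count
5. Last updated time
6. License
7. Installation method given in the README
8. Whether there are examples / demos
9. Whether it mentions Skill / MCP / CLI
10. Whether it suits agents such as Claude Code / Cursor / Codex
11. Highlights worth including in an article
12. Information that is uncertain or needs manual confirmation
Requirements:
- Prefer stable data sources; if an API is available, do not scrape the page
- Mark anything the README does not state as "unconfirmed"; do not guess
- Output a Markdown table and also save JSON
- First run an end-to-end test on the 3 public repositories above, then generate the reusable Skill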

Forge chooses the GitHub API over DOM scraping, then generates a Python script, research-repos.py, and a SKILL.md. The resulting skill can be invoked later without re‑discovering the data source.

.agents/skills/github-repo-research/
├── SKILL.md
└── scripts/
    └── research-repos.py
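The article does not reproduce the generated research-repos.py. As a rough illustration of the API-first approach, here is a minimal sketch under assumptions: it calls the public GitHub REST endpoint `https://api.github.com/repos/{owner}/{repo}` and reads the `full_name`, `html_url`, `stargazers_count`, `forks_count`, `pushed_at`, and `license.spdx_id` fields. The function names and the "unconfirmed" marker (taken from the prompt's no-guessing rule) are hypothetical, and the README-based fields are left out.

```python
import json
import urllib.request

API_ROOT = "https://api.github.com/repos/"
UNCONFIRMED = "unconfirmed"  # marker for anything the data source does not state


def repo_slug(url: str) -> str:
    """Turn https://github.com/owner/name into 'owner/name'."""
    owner, name = url.rstrip("/").split("/")[-2:]
    return f"{owner}/{name}"


def summarize(payload: dict) -> dict:
    """Map a repository payload to the report fields, never guessing missing values."""
    license_info = payload.get("license") or {}
    return {
        "name": payload.get("full_name", UNCONFIRMED),
        "url": payload.get("html_url", UNCONFIRMED),
        "stars": payload.get("stargazers_count", UNCONFIRMED),
        "forks": payload.get("forks_count", UNCONFIRMED),
        "last_updated": payload.get("pushed_at", UNCONFIRMED),
        "license": license_info.get("spdx_id") or UNCONFIRMED,
    }


def fetch_repo(url: str) -> dict:
    """Fetch one repository over the network (unauthenticated, so rate-limited)."""
    request = urllib.request.Request(
        API_ROOT + repo_slug(url),
        headers={"Accept": "application/vnd.github+json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return summarize(json.load(response))
```

Calling `fetch_repo("https://github.com/microsoft/playwright")` returns one row of the report; writing the rows out with `json.dump` would cover the "also save JSON" requirement. Separating `summarize` from `fetch_repo` lets the field mapping be checked offline against a saved payload.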

When to use an API key

Short‑lived tasks run with browser-act need no API key. Tasks that require a persistent login session use the Real Chrome path. Heavy‑weight scenarios (large‑scale extraction, cross‑page search, long‑running monitoring, or frequent anti‑bot challenges) benefit from an API key obtained from the BrowserAct website. The key does not bypass site rules, and user confirmation is still required.
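The guidance above amounts to a small decision table. The sketch below encodes it only as an illustration: the `Mode` names and the `choose_mode` function are hypothetical, not part of BrowserAct, and checking the heavy-weight case before the login case is an assumption, since the article does not say which takes precedence when both apply.

```python
from enum import Enum


class Mode(Enum):
    NO_KEY = "browser-act without an API key"
    REAL_CHROME = "Real Chrome path"
    API_KEY = "browser-act with an API key"


def choose_mode(needs_login: bool, heavy: bool) -> Mode:
    """Pick a setup for a task; 'heavy' covers large-scale or long-running work."""
    if heavy:
        return Mode.API_KEY       # assumption: heavy workloads take precedence
    if needs_login:
        return Mode.REAL_CHROME   # persistent login session
    return Mode.NO_KEY            # short-lived, ad-hoc task
```

Whatever mode is chosen, the article's caveat still holds: site rules apply and sensitive actions wait for user confirmation.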

Conclusion

BrowserAct turns browser interactions into a layer that agents can call: automatic page navigation, form filling, login‑session reuse, request capture, and screenshot logging, while sensitive actions pause for human approval. Integrating it into daily development saves considerable manual effort.

GitHub repository: https://github.com/browser-act/skills

Official guide: https://browseract.ai/guide

