When AI ‘Crayfish’ Takes Over Testing, Where Do 80% of Testers Go?
The article demonstrates how an LLM-powered agent (nicknamed “crayfish”), equipped with OpenClaw and Playwright MCP, can autonomously perform web-testing tasks, including environment setup, visual OCR, error recovery, and reporting. It argues that testing is shifting from fragile scripted automation to intent-driven testing, and warns that traditional test engineers have little time left to adapt.
The author revisits a previous prediction that 80% of test engineers would be displaced and now validates it with a hands‑on demo using an AI agent called 小龙虾 (crayfish), built on OpenClaw and the Playwright MCP skill set.
Traditional UI automation is described as labor‑intensive: engineers must write XPath/CSS selectors, craft fragile wait logic, and maintain large script bases, while environment preparation and debugging consume additional effort.
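To make the "fragile wait logic" concrete, here is a minimal sketch of the kind of polling helper that hand-written UI suites often carry around. The name `wait_until` and its parameters are illustrative assumptions, not code from the article, and the "element" is simulated so the example runs without a browser.

```python
import time
from typing import Callable


def wait_until(condition: Callable[[], bool],
               timeout: float = 5.0,
               interval: float = 0.1) -> bool:
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Returns True on success and False on timeout. Hypothetical helper:
    real suites usually wrap a selector lookup inside `condition`.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Simulated page state: the "element" becomes visible on the third poll.
    polls = {"count": 0}

    def element_visible() -> bool:
        polls["count"] += 1
        return polls["count"] >= 3

    print(wait_until(element_visible, timeout=1.0, interval=0.01))
```

Every timeout and interval in a helper like this is a tuning knob that has to be revisited whenever page timing changes, which is one reason large script bases become expensive to maintain.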
Task 1 – Simple Search and OCR: The author gives the agent a concise natural-language instruction: launch Chrome, search for “全程软件测试” (a Chinese book title on full-process software testing), open the Baidu encyclopedia entry, capture the book cover, run OCR, and output a text file. According to the author, a conventional script would require hundreds of lines and manual environment setup. When the agent detects a browser version mismatch, it installs the correct Chromium version; when it hits a missing pytesseract module, it reinstalls the dependency and then goes on to capture the correct screenshot. It then uses a multimodal model to extract the cover text and saves it as 封面.txt (封面 means “cover”). The original article illustrates the intermediate steps and the final report with screenshots.
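The missing-module recovery described in Task 1 can be approximated in a few lines of ordinary Python. This is only a sketch of the general catch, install, and retry pattern, not the agent's actual mechanism, which the article does not show. The names `import_with_recovery` and `pip_install` are hypothetical.

```python
import importlib
import subprocess
import sys
from types import ModuleType
from typing import Callable, Optional


def pip_install(package: str) -> None:
    """Install `package` into the environment of the running interpreter."""
    subprocess.check_call([sys.executable, "-m", "pip", "install", package])


def import_with_recovery(module_name: str,
                         package: Optional[str] = None,
                         installer: Callable[[str], None] = pip_install) -> ModuleType:
    """Import a module; on ModuleNotFoundError, install it once and retry.

    `package` is the distribution name to install when it differs from the
    module name. `installer` is injectable so the pattern can be exercised
    without touching the network (an assumption made for this sketch).
    """
    try:
        return importlib.import_module(module_name)
    except ModuleNotFoundError:
        installer(package or module_name)
        # Make sure freshly installed packages are visible to the import system.
        importlib.invalidate_caches()
        return importlib.import_module(module_name)


if __name__ == "__main__":
    # A standard-library module imports on the first attempt, so the
    # installer is never called here.
    print(import_with_recovery("json").__name__)
```

Calling `import_with_recovery("pytesseract")` would follow the same path. Note that pytesseract is only a Python wrapper: the Tesseract OCR engine itself has to be installed separately, which this sketch does not attempt.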
The agent recovers from errors without human intervention, unlike the traditional script that would halt on the ModuleNotFoundError: No module named 'pytesseract'.
Task 2 – Complex Multi-Cover Search: A more challenging scenario is presented, requiring the agent to locate three editions of a book among hundreds of images, capture each cover, run OCR, and produce a structured report. The author notes that such a task would be a “nightmare” for conventional automation. The agent interprets the intent, uses visual reasoning to filter images, correctly identifies the first, second, and third editions (even when the order is not specified), and outputs screenshots, extracted text, and a concise pass/fail summary. Screenshots of the agent’s progress are included throughout the article.
These demonstrations illustrate that the agent operates in an intent-driven manner: users provide high-level goals, and the AI autonomously decides how to navigate, handle environment mismatches, and apply vision models, eliminating the need for low-level selectors or hard-coded IDs.
The author then reflects on the broader impact: the four traditional “moats” of test engineers (domain knowledge, testing methodology, test-case design, and scripting expertise) are rapidly being flattened by LLM agents. With mature LLM + MCP toolchains, future testing may no longer require manual element locating, regex writing, or maintaining massive locator libraries; business analysts can describe requirements in natural language and the agent will generate, execute, debug, and report tests.
Finally, the article cautions that while AI agents can act as powerful executors, human oversight remains essential for defining correct business logic, governing agent behavior, and ensuring safety and boundary conditions in large-scale deployments.
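Returning to Task 2, the structured report with a pass/fail summary could be modeled roughly as follows. The article does not show the agent's report format, so `CoverResult`, `summarize`, the field names, and the sample values are all hypothetical placeholders rather than the demo's actual output.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CoverResult:
    """One captured cover. All fields are illustrative assumptions."""
    edition: str           # label such as "1st edition"
    screenshot: str        # path to the captured cover image
    ocr_text: str          # text extracted from the cover
    expected_keyword: str  # text that must appear for the check to pass

    @property
    def passed(self) -> bool:
        return self.expected_keyword in self.ocr_text


def summarize(results: List[CoverResult]) -> str:
    """Build a plain-text report: one line per cover plus a total."""
    lines = []
    for result in results:
        status = "PASS" if result.passed else "FAIL"
        lines.append(f"{result.edition}: {status} ({result.screenshot})")
    passed = sum(result.passed for result in results)
    lines.append(f"{passed}/{len(results)} covers passed")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder data, not results from the article's demo.
    sample = [
        CoverResult("1st edition", "cover_1.png", "全程软件测试", "全程软件测试"),
        CoverResult("2nd edition", "cover_2.png", "", "全程软件测试"),
    ]
    print(summarize(sample))
```

Keeping the pass criterion explicit in the data (here, an expected keyword) is one way for a human reviewer to stay in control of what counts as correct, in line with the article's point about oversight.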
Software Engineering 3.0 Era
With large models (LLMs) reshaping countless industries, software engineering is leading the charge into the Software Engineering 3.0 era—model-driven development and operations. This account focuses on the new paradigms, theories, and methods of SE 3.0, and showcases its tools and practices.
