Tag

mobile agent


DataFunSummit
Jul 30, 2024 · Artificial Intelligence

Multimodal Mobile AI Agent (Mobile‑Agent): From V1 to V2 and Open‑Source Practice

This article introduces Alibaba Tongyi Lab's multimodal mobile AI agent, Mobile‑Agent, covering the background of large‑model agents, the design and capabilities of V1 and V2, the multi‑agent framework, evaluation results, open‑source resources, and future development directions.

AI planning · Multi-Agent · Open Source
DataFunTalk
Feb 5, 2024 · Artificial Intelligence

Mobile-Agent: An Autonomous Multi‑Modal Mobile Device Agent with Visual Perception

The Mobile-Agent paper presents an autonomous, vision‑only multi‑modal agent that interprets user commands, locates UI elements on a smartphone screen, and executes complex tasks such as browsing, commenting, and content creation. It relies on a defined operation space together with self‑planning and self‑reflection mechanisms, and reports high success rates across diverse Chinese and English scenarios.

Mobile Automation · autonomous operation · mobile agent