Google Unveils Gemini 3.5: Omni Multimodal Model and Flash Engine Redefine AI Capabilities

At Google I/O 2026, the company launched Gemini Omni, a truly multimodal model that generates video from any combination of inputs, and Gemini 3.5 Flash, which outperforms the previous Gemini 3.1 Pro across benchmarks, doubles token throughput, and powers new Agent‑first platforms like Antigravity 2.0 and Gemini Spark.

Top Architect

During the Google I/O 2026 keynote, Google introduced Gemini Omni, a "truly all-purpose" multimodal model that accepts arbitrary combinations of images, audio, video, and text and produces high-quality video output. In the launch demo, the prompt "Explain protein folding with clay animation" generated a scientifically accurate animation of amino-acid chains forming α-helices and β-sheets. A second prompt mapped each English letter to a distinct object (e.g., C → capybara, D → disco ball, L → lava lamp), which Google presented as evidence of genuine cross-modal understanding rather than simple collage.

Gemini Omni also supports iterative editing: after generating an initial scene, users can ask the model to place the same character in a new environment, adjust camera angles, or continue the action, with the model preserving character consistency, physical logic, and scene memory across multiple rounds.

The second headline product, Gemini 3.5 Flash, was presented as Google's strongest coding and agent model to date. It surpasses the previous flagship, Gemini 3.1 Pro, on almost every benchmark. Reported scores include 76.2% on Terminal-Bench 2.1 (coding), 1656 Elo on GDPval-AA (real-world agent tasks), 83.6% on MCP Atlas (large-scale tool use), and 84.2% on CharXiv Reasoning (multimodal understanding). Flash generates 289 tokens/s, which Google says is up to four times faster than GPT-5.5 and Claude Opus 4.7.
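To put the speed claim in practical terms, the short sketch below turns the two reported figures (289 tokens/s and "up to four times faster") into generation times. The implied baseline rate is an inference from those two numbers, not a measured value for any other model, and the 2,000-token response length is an arbitrary assumption chosen for illustration.

```python
# Reported in the article.
FLASH_TOKENS_PER_S = 289.0
MAX_SPEEDUP = 4.0  # "up to four times faster"


def implied_baseline_rate() -> float:
    """Slowest comparison rate consistent with a 4x speedup (an inference)."""
    return FLASH_TOKENS_PER_S / MAX_SPEEDUP


def seconds_to_generate(tokens: int, tokens_per_s: float) -> float:
    """Time to stream `tokens` output tokens at a constant rate."""
    return tokens / tokens_per_s


if __name__ == "__main__":
    response_tokens = 2_000  # assumed response length, for illustration only
    fast = seconds_to_generate(response_tokens, FLASH_TOKENS_PER_S)
    slow = seconds_to_generate(response_tokens, implied_baseline_rate())
    print(f"at 289 tokens/s: {fast:.2f} s")
    print(f"at implied baseline: {slow:.2f} s")
```

Under these assumptions, a 2,000-token response takes roughly 7 seconds at the reported rate versus roughly 28 seconds at the implied baseline.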

Google attributes Flash's performance gains to the new Antigravity 2.0 platform, which turns the previous IDE-centric agent development environment into a standalone desktop application with an agent-first design. Antigravity 2.0 introduces dynamic sub-agent generation, asynchronous task management, scheduled tasks (e.g., daily PR checks), and new slash commands such as /goal, /grill-me, and /browser. A live demo built an operating-system kernel from scratch using 93 parallel agents, which issued over 15,000 model calls and processed 2.6 billion tokens in 12 hours. All code, including the memory-management and file-system components, was written, tested, and audited by agents. The API cost for the demonstration was under $1,000.
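The demo figures can be sanity-checked with simple arithmetic. The sketch below derives averages from the reported numbers only; it treats "over 15,000 calls" as a lower bound and "under $1,000" as an upper bound, so tokens per call and cost per million tokens are both upper bounds. The article does not say whether the 2.6 billion tokens include input (prompt) tokens, so the per-agent rate is not directly comparable to a generation speed.

```python
# Figures as reported in the article; everything below is derived arithmetic.
AGENTS = 93
MODEL_CALLS = 15_000          # "over 15,000", treated as a lower bound
TOKENS = 2_600_000_000        # 2.6 billion tokens processed
HOURS = 12
COST_USD = 1_000.0            # "under $1,000", treated as an upper bound


def demo_averages() -> dict[str, float]:
    """Return back-of-envelope averages for the kernel-building demo."""
    seconds = HOURS * 3600
    tokens_per_second = TOKENS / seconds
    return {
        "tokens_per_call": TOKENS / MODEL_CALLS,
        "tokens_per_second": tokens_per_second,
        "tokens_per_second_per_agent": tokens_per_second / AGENTS,
        "usd_per_million_tokens": COST_USD / (TOKENS / 1_000_000),
    }


if __name__ == "__main__":
    for name, value in demo_averages().items():
        print(f"{name}: {value:,.2f}")
```

The averages work out to at most about 173,000 tokens per call, about 60,000 tokens per second in aggregate (roughly 650 per agent), and a blended cost of at most about $0.38 per million tokens.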

Google also announced Gemini Spark, an always-on (24/7) personal AI agent that integrates tightly with Gmail, Docs, Sheets, and Google Slides. In a workplace scenario, Spark drafted a weekly summary email by automatically retrieving relevant information from Gmail, Docs, and chat logs, then applying a custom "ghostwriter" skill to match the presenter's tone. In a personal scenario, Spark organized a neighborhood block party: it created a Google Sheet RSVP tracker linked to Gmail, generated a Google Slides deck with event details, and sent personalized invitation emails, all without the user opening any application. Spark also supports voice input; a spoken command containing three tasks was parsed into separate parallel threads and executed automatically.
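The article does not describe how Spark executes the three tasks in parallel. As a rough illustration of the general pattern only, the sketch below runs three independent tasks concurrently with Python's asyncio. The task names and the run_task helper are hypothetical stand-ins and do not correspond to any Spark or Gemini API; the sleep call replaces real work.

```python
import asyncio


async def run_task(name: str, seconds: float) -> str:
    """Hypothetical stand-in for one parsed task; sleeps instead of doing work."""
    await asyncio.sleep(seconds)
    return f"{name}: done"


async def run_parallel(tasks: list[tuple[str, float]]) -> list[str]:
    """Run all tasks concurrently; results come back in input order."""
    return await asyncio.gather(*(run_task(name, s) for name, s in tasks))


if __name__ == "__main__":
    # Three tasks, as if already parsed from one spoken command (assumed names).
    parsed = [
        ("create RSVP sheet", 0.03),
        ("build slides", 0.02),
        ("send invites", 0.01),
    ]
    for line in asyncio.run(run_parallel(parsed)):
        print(line)
```

Because the tasks run concurrently, total wall-clock time is close to the longest single task rather than the sum of all three.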

Google also disclosed pricing: the AI Ultra subscription, which provides Spark beta access, now costs $100/month, and the top-tier plan was reduced from $250 to $200. Gemini 3.5 Flash becomes the default model for the Gemini app and Google Search AI mode. Developers can access it via the Antigravity 2.0 SDK, the Gemini API, and Google AI Studio, while enterprise customers can access it through the Gemini Enterprise Agent Platform. A more powerful Gemini 3.5 Pro is currently in internal testing and slated for release next month.

The overall message of the event was that Google has brought three critical capabilities (understanding across all modalities, generation across all modalities, and always-on agents) into a single ecosystem. In this framing, the "technical impossibility" barrier to artificial general intelligence has effectively been removed, and the remaining challenge is deployment speed.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
