Tag

image editing


AntTech
Jun 15, 2025 · Artificial Intelligence

21 Ant Research Papers Shaping CVPR 2025: AI Image & Video Generation Breakthroughs

The Interactive Intelligence Lab of Ant Technology Research Institute presented 21 accepted CVPR 2025 papers covering visual generation, editing, 3D vision, digital humans and multimodal AI, highlighting tools such as MagicQuill, Lumos, Aurora, FLARE, LeviTor, MangaNinja, AniDoc, Mimir, AvatarArtist, DiffListener, MotionStone, TensorialGaussianAvatars, DualTalk, CompreCap and Uni-AD.

CVPR2025 · Multimodal Models · computer vision
20 min read
Code Mala Tang
Jun 4, 2025 · Artificial Intelligence

Flux Kontext: How Open‑Weight AI Image Editing Beats GPT‑Image‑1

Flux Kontext, Black Forest Labs' new open‑weight AI image editing suite, enables fast, low‑cost in‑context generation and editing, with features such as character consistency, local edits, and style transfer, and reports stronger benchmark results than GPT‑Image‑1, Imagen 4, and other leading models.

AI image generation · Flux Kontext · benchmark performance
12 min read
Amap Tech
Apr 21, 2025 · Artificial Intelligence

Lenna: Language‑Enhanced Reasoning Detection Assistant and a Chain‑of‑Thought Image Editing Framework Using Multimodal Large Language Models

At ICASSP 2025, Amap (Gaode) had two papers accepted. The first presents Lenna, a language‑enhanced reasoning detection assistant that adds a DET token to multimodal LLMs and achieves state‑of‑the‑art accuracy on the RefCOCO benchmarks. The second presents a chain‑of‑thought image‑editing framework that converts complex prompts into segmentation masks and inpainting prompts for diffusion‑based inpainting, surpassing existing methods.

AI · Chain-of-Thought · ICASSP
10 min read
Rare Earth Juejin Tech Community
Feb 28, 2024 · Artificial Intelligence

A Survey of Multimodal Image Synthesis and Editing with Generative AI

This comprehensive review examines the rapid advances in generative AI for multimodal image synthesis and editing, covering visual, textual, and audio guidance; model families such as GANs, diffusion models, autoregressive models, and NeRF; and datasets, challenges, and future research directions.

GAN · diffusion models · generative AI
6 min read