Tagged articles

31 articles

May 12, 2026 · Artificial Intelligence

How DreamLite Enables Real-Time Text-to-Image Generation and Editing on Mobile Devices

DreamLite, a 0.39 B‑parameter diffusion model from ByteDance, unifies text‑to‑image generation and text‑guided editing in a single on‑device network, delivering 1024×1024 results in about three seconds on an iPhone 17 Pro while surpassing existing mobile and even many server‑side baselines.

DreamLite · Model Compression · RLHF

0 likes · 9 min read

SuanNi

May 7, 2026 · Artificial Intelligence

DreamLite: A 0.39B Mobile Model Matching Z‑Image for Real‑Time Text‑to‑Image Generation and Editing

DreamLite is a compact 0.39 B unified diffusion model open‑sourced by ByteDance that runs on smartphones, delivering text‑to‑image generation and text‑guided editing in about three seconds for 1024×1024 pictures, with performance comparable to Flux, Z‑Image and LongCat‑Image and offering two variants to balance fidelity and latency.

AI model · ByteDance · DreamLite

0 likes · 4 min read

Geek Labs

Apr 28, 2026 · Artificial Intelligence

ChatGPT and AI Tool Open-Source Projects: Multi-Account Scheduling, Image Editing API, AWS Auto-Registration

This article introduces four GitHub open‑source projects—gpt2api, chatgpt2api, kiro-auto, and hermes‑webui—that enable high‑concurrency multi‑account ChatGPT usage, DALL‑E image generation and editing, automated AWS Builder ID registration, and cross‑platform access to Hermes agents, each with usage instructions and target audiences.

AI tools · AWS automation · ChatGPT

0 likes · 7 min read

JD Cloud Developers

Apr 8, 2026 · Artificial Intelligence

How JoyAI-Image-Edit Brings Spatial Intelligence to Open‑Source Image Editing

JoyAI-Image-Edit, an open‑source multimodal foundation model from JD Research Institute, integrates text‑to‑image generation, image understanding, and instruction‑driven spatial editing, achieving world‑leading spatial perception and editing capabilities that unlock new applications across e‑commerce, robotics, 3D reconstruction, and design.

Generative Models · computer vision · image editing

0 likes · 7 min read

Machine Learning Algorithms & Natural Language Processing

Mar 15, 2026 · Artificial Intelligence

HY‑WU: Real‑Time Adaptive AI Model That Generates Parameters On‑The‑Fly

HY‑WU demonstrates that generating model parameters dynamically during inference enables a single foundation model to perform diverse image‑editing tasks, outperforming fixed‑parameter baselines in human and automatic evaluations, benchmark tests, and conflict‑task experiments, highlighting a practical real‑time adaptation approach for AI systems.

HY-WU · LoRA · Transformer

0 likes · 16 min read

AIWalker

Mar 8, 2026 · Artificial Intelligence

FireRed-Image-Edit v1.1 Boosts OOTD Element Fusion and Portrait Consistency

The Super Intelligence team at Xiaohongshu unveils FireRed-Image-Edit v1.1, an open‑source image‑editing model that dramatically improves ID‑consistent edits, multi‑element OOTD fusion, portrait makeup, and font style rendering while delivering end‑to‑end generation in 4.5 seconds on 30 GB VRAM, backed by a full training‑distillation pipeline and a technical report on arXiv.

AI model · FireRed-Image-Edit · LoRA

0 likes · 10 min read

SuanNi

Mar 7, 2026 · Artificial Intelligence

How HY‑WU Enables Real‑Time Dynamic Parameters for Large‑Scale AI Models

Tencent's HY‑WU architecture introduces functional memory that generates task‑specific parameters on the fly, overcoming catastrophic forgetting and static‑weight limitations, and demonstrates superior performance in image‑editing benchmarks compared to leading open‑source and closed‑source models.

AI Architecture · Tencent · dynamic-parameters

0 likes · 12 min read

SuanNi

Feb 23, 2026 · Artificial Intelligence

How FireRed-Image-Edit Sets New Standards for AI-Powered Image Editing

FireRed-Image-Edit, an open‑source instruction‑driven diffusion model, combines massive high‑quality data, a dual‑stream multimodal architecture, progressive training, and a comprehensive multi‑dimensional benchmark to achieve unprecedented pixel‑level control and human‑like editing performance across diverse visual tasks.

AI · Training Strategies · data engineering

0 likes · 12 min read

AI Algorithm Path

Feb 8, 2026 · Artificial Intelligence

Qwen Multi-Angle: An Open‑Source AI Tool for Full‑Perspective Image Reconstruction

The open‑source Qwen‑Image‑Edit‑2511‑Multiple‑Angles‑LoRA model can reconstruct images from 96 preset camera poses, letting users adjust distance, pitch and yaw to generate realistic multi‑angle views, with step‑by‑step usage instructions, example results, practical applications, and noted limitations.

AI · Open Source · Qwen

0 likes · 6 min read

HyperAI Super Neural

Dec 25, 2025 · Artificial Intelligence

How Qwen-Image-Layered Enables Precise, High‑Fidelity Image Layer Editing

The article introduces the Qwen‑Image‑Layered model, which solves the long‑standing AI image‑editing limitation of inseparable layers by decomposing images into independent RGBA layers that retain fidelity under scaling, repositioning and recoloring, and provides a step‑by‑step online tutorial to try the feature.

AI image generation · HyperAI tutorial · Qwen-Image-Layered

0 likes · 5 min read

DeWu Technology

Dec 25, 2025 · Frontend Development

Build a High‑Performance H5 PAG Player: SDK, Image Editing, Batch Synthesis

This guide details how to implement a full‑stack H5 PAG player for the “Use Basketball to Know Me” activity, covering SDK loading, canvas‑based image manipulation (drag, scale, rotate), dynamic layer and text replacement, real‑time preview synchronization, snapshot export, batch synthesis, performance tuning, and fallback strategies.

Batch Processing · Canvas · PAG

0 likes · 30 min read

Alimama Tech

Oct 15, 2025 · Artificial Intelligence

How Alibaba’s Taobao Starry Model Delivers Precise, Consistent E‑commerce Image Edits

Alibaba’s Taobao Starry Image Editing model tackles the e‑commerce challenge of maintaining visual consistency by introducing a high‑fidelity, plug‑in architecture, a million‑scale consistency dataset, and multi‑stage multilingual training, enabling precise, controllable edits without altering product layout or background.

Consistency · E-commerce AI · data engineering

0 likes · 10 min read

Code Mala Tang

Sep 27, 2025 · Artificial Intelligence

5 Creative Ways to Edit Images with Google Nano Banana (Gemini 2.5 Flash)

This guide showcases five practical examples—removing objects, colorizing photos, adding billboard text, maintaining character consistency, and applying brand assets—demonstrating how Google Nano Banana’s advanced AI image editing can streamline visual design tasks.

AI art · Gemini 2.5 · Google AI

0 likes · 7 min read

AI Algorithm Path

Sep 3, 2025 · Artificial Intelligence

15 Real-World Applications of Google’s Nano Banana AI Image Tool

Google's Nano Banana, an advanced multimodal AI model integrated into Gemini, delivers unprecedented character consistency and multi-step editing. This article walks through fifteen concrete use cases, from virtual try-on and background swapping to style transfer, product visualisation, educational graphics, and 3D conversion, showing how the tool can streamline creative workflows across industries.

AI image generation · Gemini · Google

0 likes · 9 min read

21CTO

Aug 28, 2025 · Artificial Intelligence

What Is Nano Banana? The Mysterious AI Image Model Challenging Google’s Gemini

Nano Banana, an enigmatic AI image‑generation model that surfaced on forums and Discord without any official announcement, boasts unprecedented speed, consistency, and language‑driven editing, sparking speculation about Google’s involvement and reshaping workflows across e‑commerce, gaming, education, and design.

AI image generation · Google speculation · Nano Banana

0 likes · 10 min read

AI Algorithm Path

Aug 24, 2025 · Artificial Intelligence

Qwen-Image-Edit: Alibaba’s Open‑Source State‑of‑the‑Art Image Editing Model

Qwen-Image-Edit, built on the 20B‑parameter Qwen‑Image foundation, introduces a dual‑path architecture that simultaneously understands semantic intent and visual details, enabling precise semantic and appearance edits, robust text manipulation, and fine‑grained region control, with open‑source weights on HuggingFace and benchmark‑proven superiority over existing models.

AI image manipulation · Qwen-Image-Edit · diffusers

0 likes · 7 min read

AI Algorithm Path

Jul 2, 2025 · Artificial Intelligence

Exploring the Open‑Source Flux.1 Kontext Dev Model for Advanced Image Editing

Black Forest Labs releases the open‑source Flux.1 Kontext Dev model, a 12‑billion‑parameter image‑editing system whose weights are publicly available; the article details its core features, benchmark‑level performance comparable to leading commercial models, access via HuggingFace, and step‑by‑step usage through Fal AI and Replicate APIs.

AI model · Fal AI · Flux.1

0 likes · 9 min read

大转转FE

Jun 30, 2025 · Mobile Development

How a Custom Android Image Editor Boosts Warehouse Efficiency

This article details the design and implementation of a native Android image‑editing component built for warehouse quality‑inspection, covering business motivations, core features such as multi‑image batch editing, matrix‑based transformations, a command‑pattern undo/redo system, technical architecture, key challenges, and future extension plans.

Android · Command Pattern · Custom View

0 likes · 29 min read

AntTech

Jun 15, 2025 · Artificial Intelligence

21 Ant Research Papers Shaping CVPR 2025: AI Image & Video Generation Breakthroughs

The Interactive Intelligence Lab of Ant Technology Research Institute presented 21 accepted CVPR 2025 papers covering visual generation, editing, 3D vision, digital humans and multimodal AI, highlighting tools such as MagicQuill, Lumos, Aurora, FLARE, LeviTor, MangaNinja, AniDoc, Mimir, AvatarArtist, DiffListener, MotionStone, TensorialGaussianAvatars, DualTalk, CompreCap and Uni-AD.

CVPR2025 · Video Generation · computer vision

0 likes · 20 min read

Code Mala Tang

Jun 4, 2025 · Artificial Intelligence

Flux Kontext: How Open‑Weight AI Image Editing Beats GPT‑Image‑1

Flux Kontext, Black Forest Labs' new open-weight AI image editing suite, enables fast, low-cost contextual generation and editing with features such as character consistency, local edits, and style transfer, and reports superior benchmark performance compared to GPT-Image-1, Imagen 4, and other leading models.

AI image generation · Flux Kontext · benchmark performance

0 likes · 12 min read

AIWalker

May 29, 2025 · Artificial Intelligence

ImgEdit-Bench Exposes Weak Image Editing Models – A ‘Death Test’ Reveals Who’s Struggling

ImgEdit introduces a large‑scale, high‑quality editing dataset and the ImgEdit‑Bench benchmark, detailing a robust data‑generation pipeline, multi‑round editing tasks, and a specialized evaluation model, and demonstrates through extensive experiments that its ImgEdit‑E1 model outperforms existing open‑source editors and narrows the gap with closed‑source systems.

AI · Vision-Language Model · benchmark

0 likes · 20 min read

AI Frontier Lectures

May 23, 2025 · Artificial Intelligence

How SuperEdit Boosts Instruction-Based Image Editing with Rectified Supervision

SuperEdit introduces rectified instruction generation and contrastive supervision to fix noisy supervision in instruction‑based image editing, achieving up to 9.19% performance gains on Real‑Edit benchmarks without extra model parameters or pre‑training, and releases all data and code publicly.

diffusion models · image editing · visual-language models

0 likes · 15 min read

AI Frontier Lectures

May 19, 2025 · Artificial Intelligence

How SuperEdit Boosts Instruction-Based Image Editing with Rectified Supervision

SuperEdit introduces rectified instruction generation and contrastive supervision to fix noisy training signals in instruction‑based image editing, achieving up to 9.19% performance gains without extra parameters or pre‑training, as demonstrated on the Real‑Edit benchmark.

diffusion models · image editing · supervision

0 likes · 13 min read

Amap Tech

Apr 21, 2025 · Artificial Intelligence

Lenna: Language‑Enhanced Reasoning Detection Assistant and a Chain‑of‑Thought Image Editing Framework Using Multimodal Large Language Models

At ICASSP 2025, Amap (Gaode) had two papers accepted. The first presents Lenna, a language-enhanced reasoning detection assistant that adds a DET token to multimodal LLMs and achieves state-of-the-art accuracy on RefCOCO benchmarks. The second presents a chain-of-thought image-editing framework that converts complex prompts into segmentation masks and inpainting prompts for diffusion-based inpainting, surpassing existing methods.

AI · ICASSP · chain-of-thought

0 likes · 10 min read

AIWalker

Apr 10, 2025 · Artificial Intelligence

DCEdit: Precise Text-Guided Image Editing that Preserves Backgrounds

DCEdit introduces a precise semantic localization strategy and a dual-level control mechanism for text‑guided image editing, delivering superior background preservation and editing quality, as demonstrated on the new RW‑800 benchmark and extensive comparisons with state‑of‑the‑art diffusion models.

AI · benchmark · diffusion models

0 likes · 16 min read

AIWalker

Mar 23, 2025 · Artificial Intelligence

One-Click Removal & Seamless Integration: CycleFlow + Diffusion Prior Power OmniPaint

OmniPaint introduces a unified diffusion‑based framework that achieves physically consistent object removal and insertion by leveraging a pre‑trained FLUX‑1 diffusion prior, a progressive CycleFlow training pipeline, and a novel reference‑free CFD metric for high‑fidelity image editing.

CFD Metric · CycleFlow · Object Insertion

0 likes · 17 min read

Alibaba Cloud Big Data AI Platform

Oct 16, 2024 · Artificial Intelligence

How VICTORIA Revolutionizes Multi‑Object Image Editing with Language‑Aware Diffusion

The VICTORIA algorithm, presented by Alibaba Cloud AI Platform PAI and South China University of Technology at ACM MM 2024, leverages linguistic dependency parsing to guide cross‑attention in Stable Diffusion, enabling accurate, training‑free multi‑object image editing while preserving spatial structure and achieving state‑of‑the‑art results on benchmark datasets.

AI research · Stable Diffusion · VICTORIA

0 likes · 10 min read

Alibaba Cloud Developer

Jun 20, 2024 · Artificial Intelligence

Build Your Own AI Image Editing Assistant with Alibaba Cloud PAI‑DSW

This guide walks you through using Alibaba Cloud's PAI‑DSW and the Free Prompt Editing algorithm to set up a personal AI‑generated content (AIGC) drawing assistant, covering environment setup, instance creation, WebUI parameter tuning, example edits, resource cleanup, and how to share your creations for rewards.

AIGC · Alibaba Cloud · PAI-DSW

0 likes · 6 min read

Alibaba Cloud Big Data AI Platform

Jun 18, 2024 · Artificial Intelligence

Free-Prompt-Editing: Efficient Text-Guided Image Editing with Stable Diffusion

The paper introduces Free-Prompt-Editing (FPE), a novel, efficient algorithm for text‑guided image editing that leverages probe analysis of cross‑ and self‑attention maps in Stable Diffusion, demonstrates its superiority over existing methods through extensive experiments, and provides open‑source implementation for both synthetic and real‑image editing.

AI research · Stable Diffusion · attention maps

0 likes · 12 min read

Rare Earth Juejin Tech Community

Feb 28, 2024 · Artificial Intelligence

A Survey of Multimodal Image Synthesis and Editing with Generative AI

This comprehensive review examines the rapid advances in generative AI for multimodal image synthesis and editing, covering visual, textual, and audio guidance, model families such as GANs, diffusion, autoregressive, and NeRF, as well as datasets, challenges, and future research directions.

GAN · NeRF · diffusion models

0 likes · 6 min read

Taobao Frontend Technology

Aug 13, 2021 · Frontend Development

How We Built a Rich Cover Image Editor with 9‑Patch Rendering and Multi‑Platform Canvas

This article details the design and implementation of a cross‑platform cover image editor for short videos, covering competitor analysis, 9‑patch image handling, positioning protocols, canvas‑based rendering steps, and future performance and feature enhancements.

9-patch · Canvas · UI

0 likes · 9 min read
