
FLUX ControlNet Inpainting and 8-Step Turbo Acceleration Models

Alibaba's Alimama Intelligent Creation team has open-sourced a FLUX-based ControlNet inpainting model that uses a DiT-oriented Interleave design for high-quality image repair, and an 8-step LoRA Turbo model that cuts inference time roughly three-fold while preserving near-original image fidelity. Both are available on Hugging Face and ModelScope.

Alimama Tech

Alibaba's Alimama Intelligent Creation and AI Application team recently open-sourced two practical companion models for the FLUX text-to-image diffusion model: a ControlNet image-inpainting model and an 8-step Turbo acceleration model, both based on FLUX.1-dev.

The ControlNet model enables controllable image repair (inpainting) by integrating a ControlNet branch into the DiT architecture of FLUX. Because FLUX uses a Transformer‑based DiT backbone rather than a traditional UNet, the team explored full‑depth ControlNet structures and adopted an Interleave design to balance convergence and memory consumption.
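The Interleave idea can be pictured with a toy sketch: rather than pairing a control block with every DiT block (a full-depth design), control residuals are injected only at alternating layers, trading a little convergence speed for much lower memory use. The layer count, stride, and functions below are illustrative assumptions, not the released FLUX architecture.

```python
# Toy illustration of an "Interleave" ControlNet layout: control residuals
# are injected at every other transformer layer instead of at all of them.
# Layer count, stride, and block functions are illustrative assumptions.

NUM_DIT_LAYERS = 8       # hypothetical DiT depth
INTERLEAVE_STRIDE = 2    # inject control at every 2nd layer

def dit_layer(x, layer_idx):
    """Stand-in for a DiT transformer block (a small fixed shift here)."""
    return x + 0.1

def control_residual(cond, layer_idx):
    """Stand-in for the ControlNet branch output at this layer."""
    return 0.01 * cond

def forward(x, cond):
    injected_at = []
    for i in range(NUM_DIT_LAYERS):
        x = dit_layer(x, i)
        if i % INTERLEAVE_STRIDE == 0:   # interleaved injection
            x = x + control_residual(cond, i)
            injected_at.append(i)
    return x, injected_at

out, layers = forward(0.0, 1.0)
print(layers)   # control is applied at layers 0, 2, 4, 6 only
```

A full-depth variant would inject at every layer; halving the injection points roughly halves the ControlNet branch's memory footprint in this toy model.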

Training was performed on tens of millions of filtered image‑text pairs, first at 768 px resolution (Alpha version) and then at 1024 px (Beta version). The Beta model improves resolution handling, detail generation, and prompt control while preserving the original content in non‑repaired regions.
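The guarantee of preserving non-repaired regions comes down to mask-based compositing: generated pixels are taken only where the mask marks a repair region, and original pixels are kept everywhere else. A minimal NumPy sketch (array shapes and values are illustrative):

```python
# Minimal sketch of inpainting compositing: keep the original image where
# mask == 0, take the generated image where mask == 1. Pure NumPy;
# the 2x2 "images" are illustrative stand-ins for pixel arrays.
import numpy as np

def composite(original, generated, mask):
    """Blend generated content into the masked region only."""
    return mask * generated + (1.0 - mask) * original

original = np.array([[10.0, 20.0], [30.0, 40.0]])
generated = np.array([[99.0, 99.0], [99.0, 99.0]])
mask = np.array([[1.0, 0.0], [0.0, 0.0]])   # repaint only the top-left pixel

result = composite(original, generated, mask)
print(result)   # [[99. 20.] [30. 40.]]
```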

Comparisons with the SDXL‑based inpainting model from Diffusers show that the FLUX‑Inpainting ControlNet achieves superior instruction following, visual quality, and consistency.

To address the high inference cost of the 12‑billion‑parameter FLUX model, the team distilled the model to an 8‑step LoRA‑based Turbo version using an improved consistency distillation algorithm with a multi‑head discriminator. The accelerated model reaches near‑original quality in only eight diffusion steps.
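Packaging the distilled acceleration as a LoRA means the speed-up ships as low-rank weight deltas that can be folded into the base model: W' = W + scale * (B @ A). The sketch below shows only this standard merge identity; the matrix sizes and scale are illustrative, not the released adapter's.

```python
# Minimal sketch of merging a LoRA delta into a base weight matrix:
# W' = W + scale * (B @ A), where A and B are low-rank factors.
# Dimensions and the scale value are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, rank = 6, 4, 2

W = rng.standard_normal((d_out, d_in))   # base (frozen) weight
A = rng.standard_normal((rank, d_in))    # LoRA down-projection
B = rng.standard_normal((d_out, rank))   # LoRA up-projection
scale = 0.5

W_merged = W + scale * (B @ A)           # fold the LoRA into the weights

x = rng.standard_normal(d_in)
y_lora = W @ x + scale * (B @ (A @ x))   # adapter applied on the fly
y_merged = W_merged @ x                  # merged weight, no runtime overhead
print(np.allclose(y_lora, y_merged))     # True
```

Merging means the 8-step Turbo variant adds no per-step compute over the base model at inference time.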

Benchmarks on an H20 machine (T5xxl‑fp16 + FLUX.1‑dev‑fp8) show inference time reduced from ~26 s (30 steps) to ~8 s (8 steps), a three‑fold speed‑up with minimal quality loss.
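The reported figures check out as back-of-the-envelope arithmetic (numbers taken from the benchmark above):

```python
# Sanity-check of the reported speed-up on the H20 benchmark:
# ~26 s at 30 steps vs ~8 s at 8 steps. Figures are from the article.
time_base, time_turbo = 26.0, 8.0   # seconds

speedup = time_base / time_turbo
print(round(speedup, 2))   # 3.25, i.e. roughly three-fold
```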

Both models have been released on Hugging Face and ModelScope, ranking high on community trend lists, and the team invites feedback and collaboration.

Tags: AI, diffusion model, ControlNet, Flux, Image Inpainting, Turbo Acceleration
Written by

Alimama Tech

Official Alimama tech channel, showcasing all of Alimama's technical innovations.
