Hyper‑SD: Trajectory‑Segmented Consistency Model for Accelerating Diffusion Image Generation
Hyper‑SD introduces a trajectory‑segmented consistency distillation framework that combines trajectory‑preserving and trajectory‑reconstruction strategies, integrates human‑feedback learning and score distillation, and achieves state‑of‑the‑art low‑step image generation performance on both SD1.5 and SDXL models.
Introduction
Recent diffusion models have achieved impressive results in image and video generation, but their multi‑step denoising process incurs high computational cost. Existing acceleration methods fall into two camps: trajectory‑preserving distillation, which hits a performance ceiling at very low step counts, and trajectory‑reconstruction distillation, which can drift away from the teacher's output domain.
To overcome these issues, ByteDance’s research team proposes Hyper‑SD, a trajectory‑segmented consistency model that blends the advantages of both strategies; the work has also been highlighted by Hugging Face’s CEO.
Method
1. Trajectory‑Segmented Consistency Distillation
The approach divides the full time range [0, T] into k segments and performs consistency distillation within each segment, progressively reducing k (8 → 4 → 2 → 1) across training stages until consistency holds over the entire trajectory. Distilling segment by segment keeps the student close to the teacher's ODE trajectory at every stage, avoiding the error accumulation of enforcing full‑range consistency in one shot. The training loss combines adversarial and MSE components, with dynamic weighting between them and noise perturbation for stability.
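The segment schedule can be sketched in a few lines. This is an illustrative toy, not the authors' code: the timestep count, the helper names, and the choice of the segment start as the consistency target are all assumptions for demonstration.

```python
# Sketch of trajectory-segmented consistency distillation (illustrative).

def segment_bounds(num_train_steps: int, k: int) -> list[tuple[int, int]]:
    """Split the timestep range [0, T) into k equal segments."""
    step = num_train_steps // k
    return [(i * step, (i + 1) * step) for i in range(k)]

def consistency_target(t: int, num_train_steps: int, k: int) -> int:
    """Map each timestep to the start of its segment -- the point toward
    which the student must be self-consistent within that segment."""
    step = num_train_steps // k
    return (t // step) * step

# Progressive schedule: fewer, longer segments as training proceeds,
# ending with a single segment (full-trajectory consistency).
for k in (8, 4, 2, 1):
    bounds = segment_bounds(1000, k)
    # ... run consistency distillation on each segment in `bounds` ...
```

With k = 1 every timestep maps to 0, recovering standard full‑trajectory consistency training as the final stage.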
2. Human‑Feedback Learning
Human aesthetic preferences and visual perception models (e.g., LAION aesthetic predictor, ImageReward, and instance‑segmentation models such as SOLO) are used as reward signals to guide the accelerated model toward more visually pleasing and structurally coherent outputs.
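A minimal sketch of how such reward signals could enter training is shown below. The scalar scores, the hinge form, and the margin value are assumptions for illustration; `aesthetic_score` and `structure_score` stand in for the reward models named above (aesthetic predictor, ImageReward, segmentation feedback), not their real APIs.

```python
# Illustrative reward-feedback objective (a sketch, not the paper's loss).

def feedback_loss(aesthetic_score: float, structure_score: float,
                  margin: float = 0.5, w_struct: float = 1.0) -> float:
    """Penalize generations whose reward falls below a target margin.
    The hinge form zeroes out already-good samples, so gradients focus
    on low-reward generations."""
    aesthetic_term = max(0.0, margin - aesthetic_score)
    structure_term = max(0.0, margin - structure_score)
    return aesthetic_term + w_struct * structure_term
```

In practice this term would be added, with a small weight, to the distillation loss so that acceleration does not come at the cost of aesthetics or structural coherence.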
3. One‑Step Generation Enhancement
Score distillation (via Distribution‑Matching Distillation) is applied to improve one‑step generation, aligning the student's one‑step output distribution with the teacher's and combining an MSE term with score‑based losses, while also incorporating the previously described human‑feedback signals.
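The combined one‑step objective can be sketched as an MSE anchor plus a distribution‑matching surrogate. This is a toy, assuming stand‑in score functions and a stop‑gradient‑style surrogate; it is not the paper's implementation, and the weighting `lam` is an assumption.

```python
import numpy as np

# Toy sketch of a one-step objective: MSE toward the teacher's multi-step
# output plus a DMD-style distribution-matching term (illustrative only).

def one_step_loss(student_x0, teacher_x0, score_real, score_fake, lam=0.25):
    """MSE anchor + score-distillation surrogate. In DMD the difference of
    the two scores gives the update direction; here it is folded into a
    surrogate whose gradient w.r.t. student_x0 (with the scores treated
    as stopped gradients) is exactly that direction."""
    mse = np.mean((student_x0 - teacher_x0) ** 2)
    # Minimizing x * (s_fake - s_real) moves samples toward regions
    # of higher real-data density relative to the student's own density.
    grad = score_fake(student_x0) - score_real(student_x0)
    dm = np.mean(student_x0 * grad)
    return mse + lam * dm
```

The MSE term stabilizes early training while the score term lets the one‑step student escape the teacher trajectory where doing so improves sample quality.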
Experiments
Quantitative comparisons on SD1.5 and SDXL show Hyper‑SD significantly outperforms current state‑of‑the‑art acceleration algorithms at every step count from 1 to 8. Visualizations confirm superior low‑step inference quality, and extensive user studies corroborate its advantage.
Hyper‑SD’s LoRA adapters are compatible with diverse style backbones and can be combined with ControlNet for controllable low‑step generation.
Conclusion
The paper presents Hyper‑SD, a unified diffusion‑model acceleration framework that delivers SOTA low‑step generation for both SD1.5 and SDXL by leveraging trajectory‑segmented consistency distillation, human‑feedback learning, and score distillation. The authors release open‑source code, LoRA plugins, and a one‑step SDXL model to foster community progress.
Rare Earth Juejin Tech Community