Tag

diffusion model


Kuaishou Tech
Feb 20, 2025 · Artificial Intelligence

Second Short-Form Video Quality Assessment and Enhancement Challenge (CVPR NTIRE 2025)

The second short-form video quality assessment and enhancement challenge, co‑organized by Kuaishou's audio‑video team and the Intelligent Media Computing Lab, invites global researchers to develop efficient quality assessment models and diffusion‑based super‑resolution methods using the new KwaiSR dataset, with prize money and potential CVPR workshop paper invitations.

AI competition · CVPR NTIRE · Image Super-Resolution
0 likes · 9 min read

AntTech
Dec 19, 2024 · Artificial Intelligence

Framer: Interactive Video Frame Interpolation Using Diffusion Models

Framer is an interactive video frame interpolation method built on large pretrained video diffusion models. Users can define custom motion trajectories or rely on an automatic mode, and the method performs strongly in image deformation, video generation, and cartoon‑to‑video applications.

AI · Computer Vision · Framer
0 likes · 4 min read

DaTaobao Tech
Nov 27, 2024 · Artificial Intelligence

FuseAnyPart: Diffusion‑Driven Facial Parts Swapping via Multiple Reference Images

FuseAnyPart is a diffusion‑model‑based facial part swapping technique that fuses features from multiple reference images via mask‑based fusion and additive injection modules, delivering high‑fidelity, consistent face edits with lower computational cost, outperforming prior methods on CelebA‑HQ and FaceForensics++ and already boosting commercial AIGC applications.

Computer Vision · diffusion model · facial part swapping
0 likes · 9 min read

360 Tech Engineering
Oct 31, 2024 · Artificial Intelligence

HiCo: Hierarchical Controllable Diffusion Model for Layout-to-Image Generation

The paper introduces HiCo, a hierarchical controllable diffusion model that enables precise layout‑to‑image generation by decoupling object and background features through weight‑shared branches and a fusion module, achieving high‑quality results and efficient inference as demonstrated on the HiCo‑7K benchmark.

AI Painting · HiCo · Image Generation
0 likes · 9 min read

Alimama Tech
Oct 17, 2024 · Artificial Intelligence

FLUX ControlNet Inpainting and 8-Step Turbo Acceleration Models

Alimama's Intelligent Creation team has open‑sourced two FLUX‑based models: a ControlNet inpainting model that leverages a DiT‑backed Interleave design for superior repair quality, and an 8‑step LoRA‑Turbo model that cuts inference time three‑fold while preserving near‑original image fidelity. Both are available on Hugging Face and ModelScope.

AI · ControlNet · Flux
0 likes · 9 min read

Model Perspective
Sep 29, 2024 · Fundamentals

What Do Broadcast, Diffusion, and SIR Models Reveal About Article Virality?

This article explores how the broadcast, diffusion, and SIR mathematical models explain the rise, spread, and eventual decline of online articles, offering practical insights for boosting initial reach and sustaining reader interest through strategic sharing and content design.

SIR model · broadcast model · diffusion model
0 likes · 7 min read
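
The rise and eventual decline described in the summary can be illustrated with a minimal sketch of the classic SIR equations, integrated with a simple Euler step. Mapping the compartments to readers is an assumption made here for illustration (S: people who have not seen the article yet, I: people actively sharing it, R: people who have lost interest), and the function name `simulate_sir` and all rate values are hypothetical, not taken from the article.

```python
def simulate_sir(s0, i0, r0, beta, gamma, steps, dt=0.1):
    """Euler integration of dS/dt = -beta*S*I, dI/dt = beta*S*I - gamma*I, dR/dt = gamma*I."""
    s, i, r = s0, i0, r0
    history = [(s, i, r)]
    for _ in range(steps):
        new_sharers = beta * s * i * dt   # readers who start sharing in this step
        new_dropouts = gamma * i * dt     # sharers who lose interest in this step
        s -= new_sharers
        i += new_sharers - new_dropouts
        r += new_dropouts
        history.append((s, i, r))
    return history


# Illustrative parameters only: 1% of the audience shares at the start.
history = simulate_sir(0.99, 0.01, 0.0, beta=0.5, gamma=0.1, steps=1000)

# Index of the step at which the share of active sharers is largest.
peak_step = max(range(len(history)), key=lambda k: history[k][1])
```

In this sketch the share of active sharers grows while `beta * S` exceeds `gamma` and shrinks afterwards, which is the rise-then-decline shape the summary refers to.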

Kuaishou Tech
Sep 27, 2024 · Artificial Intelligence

XPSR: Cross‑modal Priors for Diffusion‑based Image Super‑Resolution

The paper introduces XPSR, a diffusion‑based image super‑resolution method that incorporates cross‑modal semantic priors from a large multimodal language model, achieving state‑of‑the‑art performance on both reference and no‑reference quality metrics across synthetic and real‑world video restoration tasks.

AI research · ECCV2024 · Image Super-Resolution
0 likes · 8 min read

Baidu Tech Salon
May 24, 2024 · Artificial Intelligence

HelixDock: A Large-Scale Pretrained Full-Atom Diffusion Model for Protein–Small Molecule Docking

HelixDock, a full‑atom diffusion model pretrained on a billion‑scale simulated docking dataset covering ~200,000 protein targets, delivers state‑of‑the‑art docking accuracy—85.6% success on PoseBusters and strong generalization on cross‑docking benchmarks—showing that massive data and model scaling dramatically improve AI‑driven drug discovery, and its code and data are fully open‑source.

AI for drug discovery · HelixDock · deep learning
0 likes · 6 min read

360 Tech Engineering
Apr 17, 2024 · Artificial Intelligence

HiCo: A Hierarchical Controllable Diffusion Model for Layout‑to‑Image Generation

The 360 AI Research Institute introduces HiCo, a hierarchical controllable diffusion model that enables fine‑grained layout control across up to eight image regions, integrates seamlessly with existing Stable Diffusion ecosystems, and demonstrates superior performance on the GRIT‑VAL benchmark for layout‑aware image synthesis.

AI drawing · HiCo · controllable generation
0 likes · 8 min read

JD Retail Technology
Apr 10, 2024 · Artificial Intelligence

AI-Generated E-commerce Advertising Images: Relationship-Aware Diffusion Models for Layout, Background, and Poster Generation

This article analyzes the challenges of manual e‑commerce ad image creation and presents JD's innovative AI solutions—including a relationship‑aware diffusion model for poster layout, a category‑common and personalized background generator, and an end‑to‑end planning‑and‑rendering framework—that achieve high‑quality automatic ad creative generation and boost advertising revenue.

AI · Image Generation · diffusion model
0 likes · 21 min read

Architect
Mar 28, 2024 · Artificial Intelligence

Understanding OpenAI's Sora Video Generation Model: Architecture, Workflow, and Core Technologies

This article explains OpenAI's Sora video generation model, detailing its latent diffusion foundation, video compression network, spacetime patch representation, Diffusion Transformer processing, and decoding pipeline, while also reviewing related Stable Diffusion and Transformer concepts that enable high‑quality text‑to‑video synthesis.

AI · Sora · Transformer
0 likes · 17 min read
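
The spacetime patch representation mentioned in the summary can be sketched as cutting a video into non-overlapping blocks that span a few frames and a small pixel region, then flattening each block into one token. This is only an illustration on a toy grid of scalar values: the summary places this step after a video compression network, whose details are not given here, and the function name `spacetime_patches` and the patch sizes are hypothetical.

```python
def spacetime_patches(video, pt, ph, pw):
    """Split video[t][y][x] into non-overlapping pt x ph x pw blocks, each flattened into one token."""
    frames, height, width = len(video), len(video[0]), len(video[0][0])
    if frames % pt or height % ph or width % pw:
        raise ValueError("video dimensions must be divisible by the patch size")
    tokens = []
    for t0 in range(0, frames, pt):          # walk over time blocks
        for y0 in range(0, height, ph):      # then over rows of patches
            for x0 in range(0, width, pw):   # then over columns of patches
                token = [
                    video[t][y][x]
                    for t in range(t0, t0 + pt)
                    for y in range(y0, y0 + ph)
                    for x in range(x0, x0 + pw)
                ]
                tokens.append(token)
    return tokens


# Toy 4-frame, 4x4 "video" whose values encode their own (t, y, x) position.
video = [[[t * 100 + y * 10 + x for x in range(4)] for y in range(4)] for t in range(4)]
tokens = spacetime_patches(video, pt=2, ph=2, pw=2)
```

Because each token covers several frames at once, the resulting sequence carries both spatial and temporal context, which is what allows a Transformer to process it as a single sequence.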

DaTaobao Tech
Mar 27, 2024 · Artificial Intelligence

Building a Simple Diffusion Model with Python

This tutorial walks through implementing a basic Denoising Diffusion Probabilistic Model in Python, explaining the forward noise schedule, reverse denoising training, and providing complete code for noise schedules, diffusion functions, residual and attention blocks, a UNet architecture, loss computation, and a training loop.

DDPM · U-Net · attention
0 likes · 26 min read
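
The forward noise schedule mentioned in the summary can be illustrated with a minimal sketch. It assumes a linear beta schedule and the standard DDPM closed form x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε; the function names, the 1000-step length, and the beta range are illustrative choices and are not taken from the tutorial's own code.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Return num_steps noise variances rising linearly from beta_start to beta_end."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + k * step for k in range(num_steps)]


def cumulative_alphas(betas):
    """Return alpha_bar_t, the running product of (1 - beta_s) for s <= t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t from q(x_t | x_0) in closed form; also return the noise that was added."""
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    signal_scale = math.sqrt(alpha_bars[t])
    noise_scale = math.sqrt(1.0 - alpha_bars[t])
    x_t = [signal_scale * x + noise_scale * e for x, e in zip(x0, noise)]
    return x_t, noise


betas = linear_beta_schedule(1000)
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)           # fixed seed so the example is repeatable
x0 = [1.0, -1.0, 0.5, 0.0]       # toy "image" with four pixels
x_t, noise = q_sample(x0, 999, alpha_bars, rng)
```

At the last step almost none of the original signal remains, so `x_t` is close to pure Gaussian noise; the reverse (denoising) network is then trained to predict `noise` from `x_t` and `t`.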

DevOps
Mar 26, 2024 · Artificial Intelligence

OpenAI’s Sora: A One‑Minute Text‑to‑Video Diffusion Transformer Model

OpenAI’s newly released Sora model demonstrates one‑minute text‑to‑video generation using a diffusion‑based transformer architecture that operates on spatiotemporal patches, compresses visual data into latent codes, and builds on a wide range of prior video generation research, while the article also advertises a DevOps certification program.

AI · OpenAI · Sora
0 likes · 8 min read

DataFunTalk
Mar 21, 2024 · Artificial Intelligence

A Detailed Technical Analysis of Sora: Architecture, Key Components, and Potential Implementation

This article gives a comprehensive, accessible breakdown of Sora’s possible architecture, covering its visual encoder‑decoder, Spacetime Latent Patch, transformer‑based diffusion model, long‑time consistency strategies, and training techniques, and explains how it supports video generation at variable resolution and duration.

AI architecture · Sora · Spacetime Patch
0 likes · 49 min read

DataFunTalk
Mar 18, 2024 · Artificial Intelligence

High-Fidelity Image-to-Video Generation for E‑commerce Product Motion with AtomoVideo and Noise Rectification

This article presents Alibaba's research on using diffusion‑based AIGC techniques, including a training‑free Noise Rectification module and the AtomoVideo model, to automatically convert static product images into high‑quality, detail‑preserving video motions for e‑commerce advertising.

AIGC · AtomoVideo · Noise Rectification
0 likes · 15 min read

Alimama Tech
Mar 14, 2024 · Artificial Intelligence

High-Fidelity Image-to-Video Generation for E-commerce with AtomoVideo and Noise Rectification

Alibaba’s AI team introduced AtomoVideo, a diffusion‑based image‑to‑video generator enhanced by a training‑free Noise Rectification module that adds and corrects controlled noise to eliminate first‑frame errors, enabling merchants to automatically create high‑fidelity 4‑second 720p product videos with strong temporal consistency for e‑commerce advertising.

AI · AIGC · diffusion model
0 likes · 10 min read

DeWu Technology
Mar 11, 2024 · Artificial Intelligence

Understanding OpenAI's Sora Video Generation Model: Diffusion, Transformers, and Latent Space

OpenAI's Sora video generation model builds on latent diffusion: a video compression encoder-decoder maps video into a latent space, the latents are tokenized into spatio-temporal patches and processed by a diffusion‑trained Transformer conditioned on DALL·E‑style text annotations, and the output is decoded into high‑resolution videos up to a minute long.

AI · Sora · Transformer
0 likes · 18 min read

Xiaohongshu Tech REDtech
Feb 27, 2024 · Artificial Intelligence

InstantID: Zero-shot Identity-Preserving Generation in Seconds

InstantID, an open‑source tool released by Xiaohongshu in early 2024, generates multiple stylized portraits that preserve a person’s facial identity from a single reference photo in seconds, eliminating fine‑tuning, large storage needs, and multi‑image requirements while seamlessly working with popular diffusion models like Stable Diffusion 1.5 and SDXL.

AI · Image Generation · InstantID
0 likes · 6 min read

High Availability Architecture
Feb 22, 2024 · Artificial Intelligence

Understanding OpenAI’s Sora: A Breakthrough Text-to-Video Model

OpenAI’s newly released Sora text‑to‑video model demonstrates unprecedented high‑resolution, long‑duration video generation by encoding videos into latent space, applying diffusion with a transformer conditioned on text, and decoding back to pixels, marking a major leap in AI video synthesis and its potential applications.

AI video generation · Sora · Transformer
0 likes · 14 min read

Architects' Tech Alliance
Feb 22, 2024 · Artificial Intelligence

OpenAI’s Sora: A Breakthrough Text‑to‑Video Generation Model – Capabilities, Architecture, and Research Insights

OpenAI’s Sora model demonstrates unprecedented text‑to‑video generation with up to 60‑second high‑fidelity clips, consistent multi‑character scenes, multi‑camera motion, and world‑simulation abilities, backed by a diffusion‑transformer trained on compressed latent video patches and detailed technical analysis from its accompanying research paper.

AI video generation · Artificial Intelligence · OpenAI
0 likes · 11 min read