AIWalker
Author

Focused on computer vision, image processing, color science, and AI algorithms. Shares in-depth technical content, engineering practice, and analysis from a working AI practitioner.

162 Articles · 0 Likes · 86 Views · 0 Comments
Recent Articles

Mar 8, 2026 · Artificial Intelligence

FireRed-Image-Edit v1.1 Boosts OOTD Element Fusion and Portrait Consistency

The Super Intelligence team at Xiaohongshu unveils FireRed-Image-Edit v1.1, an open‑source image‑editing model that dramatically improves ID‑consistent edits, multi‑element OOTD fusion, portrait makeup, and font style rendering while delivering end‑to‑end generation in 4.5 seconds on 30 GB VRAM, backed by a full training‑distillation pipeline and a technical report on arXiv.

AI model · FireRed-Image-Edit · LoRA
0 likes · 10 min read
Mar 7, 2026 · Artificial Intelligence

YOLO-Master v2026.02 Unveils Four Innovations for SOTA Object Detection

Tencent’s YOLO-Master v2026.02 adds a Mixture‑of‑Experts architecture, zero‑overhead LoRA fine‑tuning, Sparse SAHI inference for large images, and Cluster‑Weighted NMS, delivering 3‑5× faster inference, up to 70% lower training resource usage, and markedly higher detection accuracy across diverse benchmarks.

LoRA · Mixture of Experts · Sparse Inference
0 likes · 15 min read
Mar 5, 2026 · Artificial Intelligence

How ViDA-UGC Leverages Large Multimodal Models for Fine-Grained Visual Quality Assessment

The article introduces ViDA-UGC, a large‑scale UGC visual‑quality dataset and its companion benchmark ViDA‑Bench, explains the MILP‑driven sampling, expert annotation pipeline, and CoT‑based evaluation framework, and shows how fine‑tuning popular multimodal LLMs on this data markedly improves low‑level quality perception, grounding, and description capabilities.

benchmark · chain-of-thought · dataset
0 likes · 12 min read
Mar 4, 2026 · Artificial Intelligence

Drifting Models Enable One‑Step Generation, Shattering Speed Records

The paper introduces Drifting Models, a new generative paradigm that moves the distribution evolution to the training phase, achieving true one‑step (1‑NFE) generation with state‑of‑the‑art ImageNet FID scores of 1.54 in latent space and 1.61 in pixel space, while eliminating the need for distillation or classifier‑free guidance.

Drifting Models · Generative Modeling · ImageNet
0 likes · 24 min read
Mar 3, 2026 · Artificial Intelligence

How NanoSD Cuts 90% of Parameters to Enable Real‑Time Photo Editing on Mobile

NanoSD distills Stable Diffusion 1.5 into a 130 M‑parameter model that runs inference in 20 ms on a Qualcomm SM8750 NPU, using hardware‑aware module pruning, module‑level knowledge distillation, and Bayesian optimization to achieve Pareto‑optimal quality‑efficiency trade‑offs for on‑device image restoration.

Bayesian Optimization · Knowledge Distillation · Model Compression
0 likes · 14 min read
Mar 3, 2026 · Artificial Intelligence

RetouchIQ’s Instruction‑Driven AI Editing Overcomes Traditional Retouching Limits

RetouchIQ introduces an instruction‑driven AI retouching system that uses a general reward model to interpret abstract user commands, delivering precise image adjustments with higher semantic consistency and visual naturalness than existing multimodal large language models, thereby lowering the technical barrier for cinematic‑style edits.

AI Image Editing · RetouchIQ · Reward Model
0 likes · 3 min read
Mar 1, 2026 · Artificial Intelligence

How X2HDR Enables AI to Achieve True Transparent HDR Imaging

X2HDR tackles the long‑standing HDR generation problem by converting color data into a perceptually uniform space and applying lightweight LoRA fine‑tuning, dramatically boosting visual fidelity while slashing data and compute demands for film, gaming, and VR.

AI · HDR imaging · LoRA
0 likes · 3 min read
Feb 27, 2026 · Artificial Intelligence

YOLO26 Review: End-to-End, NMS‑Free Edge AI Boosts CPU Inference by 43%

This article analyzes YOLO26’s architecture redesign that eliminates NMS, removes DFL, introduces progressive loss balancing, STAL, and the MuSGD optimizer, achieving up to 43% faster CPU inference and simplifying deployment for edge vision tasks across detection, segmentation, classification, pose estimation, and OBB.

CPU inference · Model Deployment · NMS-free
0 likes · 13 min read
Feb 26, 2026 · Artificial Intelligence

Overcoming Vision Transformer Bottlenecks: The Plug‑and‑Play Upgrade of ViT‑5

ViT‑5 systematically revisits five years of Transformer architecture advances, introducing seven plug‑and‑play components—LayerScale, RMSNorm, GeLU, dual positional encodings, high‑frequency RoPE for register tokens, QK‑Norm, and bias‑free projections—that together raise ImageNet‑1k Top‑1 accuracy to 84.2% (Base) and achieve superior performance across classification, generation, and segmentation tasks.

ViT-5 · Vision Transformer · computer vision
0 likes · 14 min read