Author

AIWalker

Focused on computer vision, image processing, color science, and AI algorithms; sharing hardcore tech, engineering practice, and deep insights as a diligent AI technology practitioner.

162 Articles

Latest from AIWalker

AIWalker

May 21, 2026 · Artificial Intelligence

AnyFlow: Generate High‑Quality Video in 4 Steps with Unlimited Sampling Improvement

AnyFlow introduces a flow‑map distillation framework that enables video diffusion models to produce high‑quality results in just four steps while continuously improving with additional sampling steps, supporting both causal and bidirectional architectures up to 14B parameters and allowing downstream fine‑tuning.

AI video generation · any-step sampling · consistency distillation

0 likes · 13 min read

AIWalker

May 20, 2026 · Artificial Intelligence

AnyFlow: Generate High‑Quality Video in 4 Steps and Keep Improving with More Sampling

AnyFlow introduces a flow‑map distillation framework that enables video diffusion models to produce high‑quality results in just four sampling steps while still gaining quality as the number of steps increases, supporting both causal and bidirectional architectures and scaling up to 14B parameters.

bidirectional video · causal video · few-step generation

0 likes · 14 min read

AIWalker

May 19, 2026 · Artificial Intelligence

How EUPE’s Three‑Stage Distillation Lets an 86M Model Run Classification, Segmentation and VLM on iPhone in 62 ms (SOTA)

EUPE introduces a three‑stage “scale‑then‑shrink” distillation pipeline that first trains a large proxy model to absorb heterogeneous expert knowledge and then compresses it into an 86M encoder, achieving state‑of‑the‑art performance on image classification, dense prediction and vision‑language tasks on an iPhone with only 62 ms latency.

EUPE · Edge AI · Knowledge Distillation

0 likes · 16 min read

AIWalker

May 19, 2026 · Industry Insights

How TCL’s XRGB Achieves BT.2020 131% Color Gamut with a Four‑Color Pixel Architecture

TCL Huaxing’s XRGB technology adds an independent cyan sub‑pixel to the traditional RGB layout to form an RGBC four‑color architecture, and pairs it with a custom cyan color filter and backlight and a dedicated color‑mapping algorithm. Together these deliver a 131% BT.2020 color gamut, 7000:1 contrast, 0.7% reflectivity and true 4K resolution, redefining LCD display limits.

BT.2020 · LCD display · RGBC

0 likes · 7 min read

AIWalker

May 19, 2026 · Artificial Intelligence

Why Attention Transfer Fails for DINOv2 and Other Modern ViTs: Architecture Mismatch Revealed

A large-scale benchmark of 20 pretrained ViT teachers across 11 families shows that attention copy and distillation improve some models but hurt others—especially DINOv2, CLIP, and BEiTv2—due to architecture mismatches, and adding the teachers' native components to students restores the lost performance.

Architecture Compatibility · Attention Transfer · Knowledge Distillation

0 likes · 13 min read

AIWalker

May 18, 2026 · Artificial Intelligence

ByteDance Teams with He Kaiming to Open‑Source the Continuous Diffusion Language Model Cola DLM

The article analyzes ByteDance's Cola DLM, a fully open‑source continuous diffusion language model that abandons token‑centric generation in favor of latent semantic representations, detailing its architecture, training strategy, scaling stability, and how it compares with the earlier ELF model.

ByteDance · Cola DLM · continuous diffusion

0 likes · 14 min read

AIWalker

May 17, 2026 · Industry Insights

Why Converting SDR to HDR Involves More Than Just Brightening the Image

The paper presents a pixel‑level statistical study of the ASC StEM2 test film, building a three‑layer physical‑perceptual comparison of the EXR, SDR and HDR masters. It finds that about 82% of image regions can be recovered with a restrained restoration process, while the remaining areas require targeted semantic adjustments, and it offers concrete guidance for AI‑driven HDR conversion and industry standards.

Artificial Intelligence · Digital Cinema · HDR

0 likes · 29 min read

AIWalker

May 17, 2026 · Artificial Intelligence

From Image Captioning to Detective‑Style Perception: Pixel‑Searcher Beats Closed‑Source Models

Pixel‑Searcher introduces an agentic search‑driven visual perception framework that integrates web‑based evidence with pixel‑level grounding, and the new WebEyes benchmark demonstrates its superiority over existing open‑ and closed‑source multimodal models across localization, segmentation, and VQA tasks.

Multimodal · Pixel-Searcher · WebEyes

0 likes · 16 min read

AIWalker

May 16, 2026 · Artificial Intelligence

Qwen3-VL-Seg Unlocks Pixel‑Level Open‑World Segmentation

Qwen3-VL-Seg, the latest open‑source multimodal LLM from Alibaba, extends bounding‑box predictions to pixel‑accurate masks using a lightweight box‑guided decoder, achieving strong performance on both closed‑set and open‑world segmentation tasks with only 0.4% extra parameters.

Qwen3-VL-Seg · SA1B-ORS dataset · box‑guided decoder

0 likes · 6 min read

AIWalker

Apr 20, 2026 · Artificial Intelligence

How VA‑π Bridges Tokenizers and Autoregressive Generators for Pixel‑Perfect Images

VA‑π introduces a lightweight post‑training framework that uses variational inference and reinforcement learning to align tokenizers with visual autoregressive generators, achieving dramatic quality gains, extreme training efficiency, and robust pixel‑level reconstruction across diverse image generation tasks.

Autoregressive Models · Pixel Alignment · post-training

0 likes · 14 min read