Old Zhang's AI Learning
Author


AI practitioner specializing in large-model evaluation and on-premise deployment, agents, AI programming, Vibe Coding, general AI, and broader tech trends, with daily original technical articles.

210 articles · 0 likes · 266 views · 0 comments
Recent Articles

Old Zhang's AI Learning
May 31, 2026 · Artificial Intelligence

Qwen3.6-35B-A3B NVFP4: A Stable, Highly Compressed Quantized Model

NVIDIA's NVFP4 quantization cuts Qwen3.6-35B-A3B's memory footprint threefold with almost no accuracy loss, deploys plug‑and‑play through vLLM, and outperforms other 4‑bit formats on Hopper/Blackwell GPUs, making it a practical choice for production AI workloads.

MoE · NVFP4 · Qwen3.6-35B-A3B
0 likes · 13 min read
Old Zhang's AI Learning
May 31, 2026 · Artificial Intelligence

vLLM 0.22 Release: Production-Ready DeepSeek V4 and Extreme KV Cache Compression

The vLLM 0.22 stable release adds production‑grade DeepSeek V4 support, large‑scale kernel fusion, speedups of up to 10‑20×, Batch Invariance with a 28.9% latency gain, a Rust front‑end, and multi‑level KV cache offload that can double context length. With hardware coverage spanning NVIDIA, AMD, CPU, and RISC‑V, it is a pivotal upgrade for inference infrastructure teams.

Batch Invariance · DeepSeek V4 · Inference Optimization
0 likes · 13 min read
Old Zhang's AI Learning
May 30, 2026 · Artificial Intelligence

vLLM Introduces Native RL API for Seamless Weight Synchronization

vLLM’s new native RL API introduces a four‑stage weight‑transfer protocol, pluggable backends, and a keep‑mode pause/resume mechanism that eliminates deadlocks in DPEP deployments, with large‑scale validations on SkyRL and Prime‑RL demonstrating reliability and performance gains.

CUDA IPC · NCCL · RL API
0 likes · 14 min read
Old Zhang's AI Learning
May 30, 2026 · Artificial Intelligence

Set Up an Entire AI Development Pipeline with a Single Command

AI Factory is an npm package that automates the setup of a full AI development pipeline: it detects the project stack, installs the required skills and services, and provides a spec‑driven, multi‑agent workflow with planning, implementation, verification, and handoff commands, so developers can focus on writing requirements.

AI agents · AI development · automation
0 likes · 9 min read
Old Zhang's AI Learning
May 29, 2026 · Artificial Intelligence

Run Your Own AI‑Powered Company with 170+ Ready‑to‑Work Agents

The article reviews the open‑source “The Agency” repository, which bundles over 170 AI subagents across 17 departments, from engineering and design to marketing and sales. The repository provides role‑based prompts, SOPs, and deliverables for Claude Code and other tools, and the article shares installation steps, usage examples, and practical tips.

AI agents · Claude Code · Open Source
0 likes · 10 min read
Old Zhang's AI Learning
May 29, 2026 · Artificial Intelligence

How NVIDIA’s Polar Enables Any Agent Framework to Plug Into Reinforcement Learning

Integrating diverse AI agent harnesses into reinforcement‑learning pipelines is notoriously labor‑intensive, but NVIDIA’s new Polar system inserts an API‑proxy layer that treats any harness as a black box, enabling seamless rollout recording and trajectory reconstruction, as demonstrated by dramatic performance gains on a 4B model across multiple harnesses.

AI agent · API Proxy · NVIDIA
0 likes · 10 min read
Old Zhang's AI Learning
May 29, 2026 · Artificial Intelligence

How I Got an AI Agent to Open a Browser, Scrape Hugging Face Papers, and Auto‑Post to X

This article reviews LocoAgent, an open‑source AI‑powered social‑media agent that uses real Chrome sessions to fetch Hugging Face daily papers, process them with a lightweight model, and automatically post summaries to X via customizable workflows, detailing setup, execution, and observed results.

AI agent · Hugging Face · Social Media
0 likes · 8 min read
Old Zhang's AI Learning
May 28, 2026 · Artificial Intelligence

How Anthropic Contains Claude: Three Isolation Strategies Explained

Anthropic’s engineering blog reveals that securing powerful AI agents like Claude requires focusing on blast radius and implementing three layered defenses—model, environment, and external content—through distinct isolation approaches, hard OS sandboxes, and practical lessons from real‑world pitfalls.

AI agent security · Anthropic · Claude
0 likes · 11 min read