Author

Old Zhang's AI Learning

AI practitioner specializing in large-model evaluation and on-premise deployment, agents, AI programming, Vibe Coding, general AI, and broader tech trends, with daily original technical articles.

210

Articles

Likes

266

Views

Comments

Latest from Old Zhang's AI Learning

100 recent articles max

Old Zhang's AI Learning

May 7, 2026 · Artificial Intelligence

How Unsloth and NVIDIA Boost Consumer‑GPU LLM Training by ~25% with Three Simple Optimizations

Unsloth and NVIDIA identified three low‑level bottlenecks in LLM fine‑tuning on consumer GPUs: repeated packed‑sequence metadata construction, serialized copy‑and‑compute during gradient checkpointing, and per‑expert routing overhead in MoE. Targeted patches for each together deliver roughly a 25% speedup without requiring users to change hardware, code, or frameworks.

GPU optimization · LLM training · Mixture of Experts

0 likes · 12 min read

Old Zhang's AI Learning

May 6, 2026 · Artificial Intelligence

Google Boosts Gemma 4 Inference Speed Up to 3× with MTP Drafter and Day‑0 vLLM Support

Google’s new Multi‑Token Prediction (MTP) drafter for Gemma 4 delivers up to three‑fold inference speedups across hardware and frameworks—validated by official benchmarks and independent DGX Spark tests—while preserving identical output quality, and is immediately usable via Hugging Face, vLLM, MLX, Ollama and edge‑device runtimes.

Apple Silicon · Gemma 4 · LLM inference

0 likes · 9 min read

Old Zhang's AI Learning

May 6, 2026 · Information Security

Why Large‑Model AI Agents Need Strict Security Controls

The article compares AWS Rex, which enforces Cedar policies on Rhai scripts, with Vercel deepsec, which lets powerful coding agents hunt vulnerabilities, showing how both defensive and offensive approaches are shaping the emerging security model for AI agents in production.

AI agents · Cedar · Rex

0 likes · 12 min read

Old Zhang's AI Learning

May 6, 2026 · Artificial Intelligence

GPT-5.5 Instant Arrives: Smarter, Clearer, More Personalized AI

OpenAI has silently replaced the default ChatGPT model with GPT‑5.5 Instant, delivering a 52.5% drop in hallucinations, 30% shorter responses, deeper personalization via memory sources, and higher benchmark scores across a range of professional tasks, while rolling out new pricing and usage tiers.

AI benchmarks · ChatGPT · GPT-5.5

0 likes · 11 min read

Old Zhang's AI Learning

May 6, 2026 · Artificial Intelligence

Solving RAG’s Biggest Pain Point: Introducing the Open‑Source CocoIndex

The article argues that the main weakness of RAG and agent context is stale data rather than chunking or reranking. CocoIndex, a Rust‑based incremental engine with a declarative Python API, addresses this with fresh, delta‑processed context, automatic schema evolution, and production‑grade features, demonstrated through PDF‑to‑Markdown pipelines and a podcast knowledge‑graph case study.

Python · RAG · Rust

0 likes · 13 min read

Old Zhang's AI Learning

May 6, 2026 · Frontend Development

Testing Open‑Slide: A React‑Based PPT Framework Built for AI Agents

Open‑slide is a React and Tailwind powered slide framework designed for AI coding agents such as Claude Code, which generate 1920×1080 decks from natural‑language prompts. It offers agent‑native authoring, inspector comments, asset management, presenter mode, and static deployment; the article also gives a hands‑on evaluation of its strengths and limitations.

AI agents · Claude Code · Frontend

0 likes · 11 min read

Old Zhang's AI Learning

May 5, 2026 · Artificial Intelligence

Claude Enters Finance: 10 Open‑Source Financial Agent Templates Unveiled

Anthropic released ten ready‑to‑use financial agent templates that bundle skills, data connectors, and sub‑agents and run natively in Excel, PowerPoint, Word, and Outlook. The templates are open‑sourced on GitHub, support two deployment modes, score 64.37% on the Vals AI finance benchmark, and integrate dozens of market data sources; the article weighs their strengths against notable limitations.

Agent Templates · Claude · Data Connectors

0 likes · 14 min read

Old Zhang's AI Learning

May 5, 2026 · Artificial Intelligence

Why the Mysteriously Popular DeepSeek‑TUI Open‑Source Coding Agent Is Gaining Traction in China

DeepSeek‑TUI, a Rust‑based terminal coding agent built on DeepSeek‑V4, has unexpectedly gone viral in China thanks to its native RLM, full toolset, Chinese‑friendly installation, and the author’s candid use of AI‑generated Chinese to engage the local developer community.

AI coding agent · CLI · DeepSeek

0 likes · 10 min read

Old Zhang's AI Learning

May 5, 2026 · Artificial Intelligence

vLLM 0.20.1 Fixes Instability and Speed Issues for DeepSeek V4

The vLLM 0.20.1 patch, released shortly after 0.20.0, consolidates stability fixes and performance optimizations for DeepSeek V4, adds several bug fixes, updates installation instructions, and provides targeted upgrade recommendations for different user scenarios.

Bug Fix · DeepSeek V4 · GPU inference

0 likes · 9 min read

Old Zhang's AI Learning

May 4, 2026 · Artificial Intelligence

How DeepSeek’s New Paper Redefines Multimodal Reasoning with Visual Primitives

DeepSeek’s new paper "Thinking with Visual Primitives" tackles the reference gap in multimodal models by introducing points and boxes as reasoning units, achieving up to 8× token efficiency and leading benchmark scores in counting, spatial reasoning, and maze navigation compared with GPT‑5.4, Claude‑Sonnet‑4.6 and Gemini‑3‑Flash.

DeepSeek · Multimodal · Visual Primitives

0 likes · 10 min read
