SenseTime’s ‘Big Device’ Powers the Leap of Chinese AI from Usable to Practical

According to the article, DeepSeek V4’s launch was delayed deliberately so that the model could be fully adapted to Huawei’s Ascend chips. SenseTime’s ‘Big Device’ serves as the middleware that tunes hardware‑level scheduling, which enables million‑token contexts and brings Chinese AI performance closer to that of Nvidia‑based systems, although throughput challenges remain.

AI infrastructure, Chinese AI, DeepSeek V4

0 likes · 7 min read

DeepHub IMBA

Mar 13, 2026 · Artificial Intelligence

Why Bigger Context Windows Make RAG Essential, Not Redundant

Although expanding LLM context windows seems to eliminate the need for Retrieval‑Augmented Generation, in practice larger windows dilute attention and cause retrieval failures, so RAG remains crucial for filtering high‑signal content and maintaining answer quality.

AI Architecture, Attention Dilution, LLM

0 likes · 7 min read

Programmer's Advance

Jan 12, 2026 · Artificial Intelligence

DeepSeek V4 Review: Open‑Source 1‑Trillion‑Parameter Model That Beats Claude & GPT for Developers

DeepSeek V4, an upcoming open‑source coding model with 1 trillion parameters, is claimed to surpass Claude and GPT through innovations such as mHC, DSA and MoE. It promises a context of more than 1 M tokens, 10× faster inference, and dramatically lower API costs, which would make its API a game‑changer for most developers while leaving local deployment practical for only a few large enterprises.

AI coding model, API vs local deployment, DeepSeek V4

0 likes · 19 min read

Ops Development & AI Practice

Aug 5, 2024 · Artificial Intelligence

What Makes Google Gemini 1.5 Pro a Game‑Changer? 2M‑Token Context & Code Execution

Google Gemini 1.5 Pro offers a 2‑million‑token context window and built‑in Python code execution, and is accompanied by the developer‑friendly open model Gemma 2 and a cost‑effective Flash variant, expanding real‑world applications from legal analysis to scientific research.

AI models, AI productivity, Code Execution

0 likes · 7 min read
