Tagged articles

2074 articles

Page 7 of 21

Feb 23, 2026 · Information Security

PentAGI: AI‑Powered Penetration Testing Platform Integrates 20+ Tools to Redefine Security Assessments

PentAGI is an open‑source, AI‑driven penetration testing platform released by VXControl in early 2025 that automatically orchestrates over twenty security tools—including Nmap, Metasploit, sqlmap—and generates comprehensive reports within isolated Docker environments, offering advanced agent architecture, real‑time intelligence gathering, and scalable deployment options.

AI penetration testingDockerLLM

0 likes · 5 min read

PentAGI: AI‑Powered Penetration Testing Platform Integrates 20+ Tools to Redefine Security Assessments

AI Tech Publishing

Feb 23, 2026 · Artificial Intelligence

Final Lesson: Build a Fully Working RSS News Brief Agent

In this final lesson of a nine‑day Agent engineering series, the author integrates the full Agent Loop, tools, MCP, skills, RAG, context handling, multi‑turn dialogue, and multi‑agent coordination to create a runnable RSS news‑briefing Agent that fetches feeds in parallel, filters content with LLMs, summarizes articles, and outputs a markdown report.

Agent CoordinationLLMParallel Fetching

0 likes · 12 min read

Final Lesson: Build a Fully Working RSS News Brief Agent

Old Zhang's AI Learning

Feb 22, 2026 · Artificial Intelligence

Why the Minimalist pi Coding Agent Beats Feature‑Heavy Claude Code and OpenCode

In a landscape where Claude Code, Cursor and Windsurf pile on built‑in features, the pi terminal coding agent adopts a minimalist "primitives, not features" philosophy, removes most defaults, offers extensible CLI tools, supports dozens of LLM providers, and outperforms its rivals in Terminal‑Bench 2.0.

Coding AgentLLMTerminal-Bench

0 likes · 16 min read

Why the Minimalist pi Coding Agent Beats Feature‑Heavy Claude Code and OpenCode

AI Tech Publishing

Feb 22, 2026 · Artificial Intelligence

Mastering Multi‑Agent Collaboration: Handoff Mode and Coordination

This lesson explains how to extend a single‑agent system with multi‑agent collaboration, covering context isolation, Handoff and Router patterns, flat coordinator architecture, code examples, task decomposition, and practical run‑time demos for building complex AI workflows.

AICoordinatorHandoff

0 likes · 20 min read

Mastering Multi‑Agent Collaboration: Handoff Mode and Coordination

Machine Learning Algorithms & Natural Language Processing

Feb 21, 2026 · Industry Insights

Why the App Store Model Is Obsolete: Karpathy’s Radical Call for On‑Demand App Creation

Karpathy argues that as LLM agents can instantly generate highly customized software, the traditional App Store model of discrete downloadable apps is becoming outdated, sparking debate over AI‑native services, sensor APIs, and the future of on‑demand, temporary applications.

AI agentsAI-native CLIApp Store

0 likes · 8 min read

Why the App Store Model Is Obsolete: Karpathy’s Radical Call for On‑Demand App Creation

PaperAgent

Feb 21, 2026 · Artificial Intelligence

Why Millions of LLM Agents Still Fail to Form a Real Society

An in‑depth analysis of the Moltbook platform shows that even with 2.6 million autonomous LLM agents interacting for months, large‑scale interaction does not automatically lead to genuine social structures, revealing three layers of socialization failure and offering a three‑dimensional diagnostic framework for AI societies.

AI agentsAI societyDiagnostic framework

0 likes · 9 min read

Why Millions of LLM Agents Still Fail to Form a Real Society

Architect

Feb 20, 2026 · Artificial Intelligence

How Agent Loops Give AI Agents a Personality: Engineering Secrets Revealed

This article explains how the Agent Loop—an engineered while‑loop that repeatedly calls an LLM, decides when to use tools, executes them, and feeds results back—creates persistence, style, memory, judgment, and safety boundaries that together make an AI agent feel like it has its own personality.

AI Agent EngineeringAgent LoopLLM

0 likes · 24 min read

How Agent Loops Give AI Agents a Personality: Engineering Secrets Revealed

Open Source Tech Hub

Feb 20, 2026 · Artificial Intelligence

How to Build AI Agents in PHP with the Model Context Protocol (MCP)

Learn how to connect PHP-based AI agents to the Model Context Protocol (MCP) using the open‑source Neuron AI framework, covering MCP fundamentals, server setup, tool integration, and example code for creating custom agents that can invoke external APIs, databases, and web content.

AI agentsLLMMCP

0 likes · 12 min read

How to Build AI Agents in PHP with the Model Context Protocol (MCP)

21CTO

Feb 19, 2026 · Fundamentals

Why Compilers Still Matter: Debunking Musk’s ‘Code‑Free’ Future

The article traces Grace Hopper’s pioneering compiler work, critiques Elon Musk’s claim that AI will eliminate coding, explains how modern compilers transform source code through multiple deterministic stages, and argues that source code remains essential despite advances in large language models.

Code OptimizationLLMcompilers

0 likes · 17 min read

Why Compilers Still Matter: Debunking Musk’s ‘Code‑Free’ Future

Black & White Path

Feb 19, 2026 · Information Security

How AI Cracks AWS in Under 8 Minutes, Rendering Cloud Defenses Useless

A Sysdig report shows that attackers using large language models can steal credentials, elevate privileges, move laterally across 19 AWS accounts, hijack Amazon Bedrock models, and abuse GPU resources—all within eight minutes, leaving traditional cloud defenses with virtually no response window.

AIGPU abuseLLM

0 likes · 6 min read

How AI Cracks AWS in Under 8 Minutes, Rendering Cloud Defenses Useless

Machine Learning Algorithms & Natural Language Processing

Feb 18, 2026 · Artificial Intelligence

Microsoft’s 671B LLM Unifies Offline Ad Tasks—Can It Cut Compute Costs?

Microsoft’s AdNanny replaces a forest of specialized offline models with a single 671 B LLM, using a three‑stage data factory to generate reasoning‑rich corpora, dynamic task re‑weighting, RL‑based metric alignment, and a hybrid 31‑pipeline‑parallel architecture that halves compute cost while boosting performance on core ad‑ranking tasks.

AdNannyLLMdynamic weighting

0 likes · 9 min read

Microsoft’s 671B LLM Unifies Offline Ad Tasks—Can It Cut Compute Costs?

AI Engineering

Feb 17, 2026 · Artificial Intelligence

Claude Sonnet 4.6: Million‑Token Context, Human‑Level Computer Skills, Near‑Opus Performance

Claude Sonnet 4.6, Anthropic’s latest model, introduces a beta‑stage million‑token window and markedly better coding, computer‑use and long‑context reasoning, scoring 72.5% on OSWorld versus 14.9% for Sonnet 3.5, while offering Excel connectors, dynamic search filtering, stronger prompt‑injection resistance, and a pricing tier that makes it a strong alternative to Opus for many workloads.

AI codingAPIClaude

0 likes · 4 min read

Claude Sonnet 4.6: Million‑Token Context, Human‑Level Computer Skills, Near‑Opus Performance

Machine Learning Algorithms & Natural Language Processing

Feb 17, 2026 · Artificial Intelligence

Beyond Single LLMs: MoCo, a Multi‑Model Collaboration Framework

MoCo is an open‑source Python framework that unifies 26 algorithms across four collaboration levels, enabling researchers to scale model ensembles from 2 to 16 LLMs, explore diversity benefits, and solve tasks that single models cannot handle.

AI scalingLLMMoCo

0 likes · 7 min read

Beyond Single LLMs: MoCo, a Multi‑Model Collaboration Framework

AI Insight Log

Feb 17, 2026 · Artificial Intelligence

Qwen 3.5 Launches on New Year’s Eve as DeepSeek Only Sends a Holiday Greeting

On Chinese New Year's Eve, Alibaba's Qwen 3.5 open‑source model—featuring a 397 billion‑parameter backbone with a 17 billion‑parameter active set, hybrid linear attention, and sparse MoE—was released under Apache 2.0, delivering 8.6‑19× faster inference, top‑tier agent, code and multimodal scores, and rapid integration across major AI platforms.

Apache-2.0LLMOpen Source

0 likes · 11 min read

Qwen 3.5 Launches on New Year’s Eve as DeepSeek Only Sends a Holiday Greeting

Black & White Path

Feb 17, 2026 · Information Security

AI-Generated Malware Exploits React2Shell to Attack Docker: A Low‑Barrier Threat Surge

A Darktrace‑detected campaign shows AI‑generated malware leveraging the React2Shell vulnerability to compromise an intentionally exposed Docker daemon, download LLM‑crafted payloads, and install XMRig mining software, highlighting a new low‑skill threat vector that evades traditional signature defenses.

AI-generated malwareDockerLLM

0 likes · 5 min read

AI-Generated Malware Exploits React2Shell to Attack Docker: A Low‑Barrier Threat Surge

AI Cyberspace

Feb 16, 2026 · Artificial Intelligence

Unlocking Claude’s Power: A Deep Dive into Agent Skills and Their Architecture

This article explains the concept, design, implementation, and best‑practice guidelines of Anthropic’s Claude Agent Skills, compares them with the MCP protocol, and provides practical instructions for creating, installing, and using Skills to extend large‑language‑model capabilities efficiently.

Agent SkillsClaudeLLM

0 likes · 18 min read

Unlocking Claude’s Power: A Deep Dive into Agent Skills and Their Architecture

AI Tech Publishing

Feb 15, 2026 · Artificial Intelligence

Mastering Agent Tool Use: Adding Search, Time, and Calculator Functions

This tutorial extends a minimal LLM Agent loop by introducing Tool Use (function calling) to give the agent actionable capabilities—searching the web, retrieving the current datetime, and performing mathematical calculations—while explaining the BaseTool architecture, registration process, system‑prompt adjustments, and practical execution examples.

AI AgentBaseToolFunction Calling

0 likes · 15 min read

Mastering Agent Tool Use: Adding Search, Time, and Calculator Functions

PaperAgent

Feb 15, 2026 · Artificial Intelligence

How MiniCPM‑SALA Merges Sparse and Linear Attention to Break Long‑Context Limits

MiniCPM‑SALA introduces a hybrid sparse‑linear attention architecture that reduces quadratic compute and memory costs, achieves state‑of‑the‑art performance on long‑context benchmarks, and delivers up to 3.5× faster inference than full‑attention models on sequences up to 1 million tokens.

LLMLinear AttentionSparse Attention

0 likes · 17 min read

How MiniCPM‑SALA Merges Sparse and Linear Attention to Break Long‑Context Limits

AI Insight Log

Feb 14, 2026 · Artificial Intelligence

ByteDance Unveils Doubao 2.0 Pro: A Domestic Model Taking on GPT‑5.2

ByteDance's Seed 2.0 Pro (Doubao 2.0) showcases industry‑leading performance on math, vision, document, long‑video, and code benchmarks, dramatically lowers inference cost, and is now available in the Doubao app and Trae IDE, positioning it as a serious challenger to GPT‑5.2 and other top LLMs.

AIByteDanceDoubao

0 likes · 7 min read

ByteDance Unveils Doubao 2.0 Pro: A Domestic Model Taking on GPT‑5.2

DataFunTalk

Feb 14, 2026 · Artificial Intelligence

Memory‑Based Self‑Evolution: Enabling AI Agents to Learn Like Humans

This article explores a new agent‑optimization paradigm—Memory‑Based Self‑Evolution—detailing how dynamic memory systems such as Dynamic Cheatsheet, ReasoningBank, ACE, and MemGen transform LLM agents from static, parameter‑only models into continuously learning entities that can adapt to real‑world data, with a focus on insurance industry applications.

Agent MemoryInsurance AILLM

0 likes · 13 min read

Memory‑Based Self‑Evolution: Enabling AI Agents to Learn Like Humans

Old Zhang's AI Learning

Feb 14, 2026 · Artificial Intelligence

Translate Full PDFs While Preserving Layout Using LLMs – Core Code Included

This article presents a two‑stage, cache‑enabled pipeline that extracts text blocks from a PDF with PyMuPDF, translates them via a large‑language‑model API, and re‑renders each page as an image with Chinese text overlaid to keep the original layout, along with full Python code and usage instructions.

LLMLarge Language ModelPDF translation

0 likes · 10 min read

Translate Full PDFs While Preserving Layout Using LLMs – Core Code Included

AI Engineering

Feb 14, 2026 · Artificial Intelligence

DeepSeek‑V4‑Lite‑285B Hits 100% Recall in 256K Token Tests – A Needle‑in‑a‑Haystack Benchmark

Community testing of DeepSeek's rumored V4‑Lite‑285B model using the OpenAI MRCR 8‑pin standard shows perfect 1.0000 scores on several 128K‑token samples and a 256K‑token sample, achieving 100% recall in native 256K context while longer contexts drop to about 60%, with a note that the "needle‑in‑a‑haystack" method may be exploitable by DSA mechanisms.

DeepSeekLLMlong-context

0 likes · 3 min read

DeepSeek‑V4‑Lite‑285B Hits 100% Recall in 256K Token Tests – A Needle‑in‑a‑Haystack Benchmark

Alibaba Cloud Developer

Feb 14, 2026 · Artificial Intelligence

Revamping AliGo’s AI Travel Assistant: Multi‑Agent Architecture & Prompt Engineering

The AliGo travel platform upgraded its AI assistant by replacing a single‑agent workflow with a modular multi‑agent system, introducing dynamic prompt generation, real‑time reasoning chains, context sharing, observability, and a knowledge base, which dramatically improved accuracy, stability, and user experience.

AI ArchitectureAgentScopeKnowledge Base

0 likes · 19 min read

Revamping AliGo’s AI Travel Assistant: Multi‑Agent Architecture & Prompt Engineering

PaperAgent

Feb 12, 2026 · Artificial Intelligence

How GLM-5 Turns LLMs into System‑Architect Agents: A Deep Technical Review

An in‑depth analysis shows how GLM‑5 surpasses traditional code‑generation LLMs by autonomously designing, implementing, and debugging complex multi‑agent systems, from a fireworks HTML demo to a 35,000‑line TrustGraph refactor, highlighting its architecture, tool integration, and cost‑effective advantages.

AI codingBackend DevelopmentLLM

0 likes · 9 min read

How GLM-5 Turns LLMs into System‑Architect Agents: A Deep Technical Review

DataFunTalk

Feb 11, 2026 · Artificial Intelligence

Why Most RAG Deployments Fail and How to Build a Production‑Ready RAG System

This round‑table dissects the gap between RAG’s hype and real‑world production, exposing common pitfalls such as low recall, hallucinations and cost overruns, and then delivers a systematic diagnostic framework, hybrid search strategies, fine‑tuning rules, and practical best‑practice roadmaps for building reliable enterprise RAG solutions.

Agentic RAGHybrid SearchLLM

0 likes · 20 min read

Why Most RAG Deployments Fail and How to Build a Production‑Ready RAG System

High Availability Architecture

Feb 11, 2026 · Artificial Intelligence

Choosing the Right Sandbox Architecture for AI Agents: Inside vs. Tool Mode

The article explains two sandbox integration patterns for AI agents—running the agent inside a sandbox or using the sandbox as an external tool—detailing their advantages, trade‑offs, security implications, and practical implementation with the open‑source deepagents framework.

AI agentsDeepAgentsLLM

0 likes · 13 min read

Choosing the Right Sandbox Architecture for AI Agents: Inside vs. Tool Mode

DaTaobao Tech

Feb 9, 2026 · Artificial Intelligence

Boosting Trustworthiness in Retrieval‑Augmented Generation: The Trustworthy Generation Design Pattern

This article presents the Trustworthy Generation design pattern for Retrieval‑Augmented Generation (RAG) systems, analyzes four root causes of low trustworthiness—retrieval errors, content reliability, pre‑retrieval reasoning mistakes, and model hallucinations—and proposes layered solutions, citation techniques, CRAG and Self‑RAG architectures, guardrails, and practical trade‑offs.

AI safetyLLMRAG

0 likes · 16 min read

Boosting Trustworthiness in Retrieval‑Augmented Generation: The Trustworthy Generation Design Pattern

Data Party THU

Feb 9, 2026 · Artificial Intelligence

Aligning Collaborative Filtering with LLM Token Generation: The TCA4Rec Breakthrough

This paper introduces the TCA4Rec framework that directly aligns item‑level collaborative‑filtering preferences with token‑level objectives of large language models, presenting novel modules, extensive experiments, and analysis that demonstrate significant performance gains in generative recommendation tasks.

Generative RecommendationLLMRecommendation Systems

0 likes · 9 min read

Aligning Collaborative Filtering with LLM Token Generation: The TCA4Rec Breakthrough

PaperAgent

Feb 9, 2026 · Artificial Intelligence

Can Online Evaluation Unlock AI Assistants' Long-Term Memory? Inside AMemGym

AMemGym introduces an on‑policy, interactive benchmark that evaluates and trains AI assistants' long‑term memory by structuring state evolution, diagnosing memory failures, and enabling agents to self‑evolve, revealing that selective memory writing outperforms passive approaches across various LLM and agent architectures.

AI memoryLLMagent

0 likes · 8 min read

Can Online Evaluation Unlock AI Assistants' Long-Term Memory? Inside AMemGym

Shuge Unlimited

Feb 9, 2026 · Artificial Intelligence

Build Agent Workflows in 3 Minutes with Refly’s Natural‑Language Builder

Refly is an open‑source Agent Skills Builder that lets you create production‑grade AI workflows in minutes using natural language, offering versioned, reusable skills, runtime intervention, extensive tool integrations, and export options that outperform traditional automation platforms.

AI automationAgent workflowLLM

0 likes · 16 min read

Build Agent Workflows in 3 Minutes with Refly’s Natural‑Language Builder

Data Party THU

Feb 8, 2026 · Artificial Intelligence

How LangGraph Turns Multi‑Agent Workflows into Editable Graphs

This article explains LangGraph's graph‑based design, runtime behavior, state management, checkpoint persistence, and flexible workflow modifications, providing concrete code examples and patterns that illustrate why the framework is well‑suited for complex multi‑agent AI systems.

AILLMLangGraph

0 likes · 14 min read

How LangGraph Turns Multi‑Agent Workflows into Editable Graphs

AI Tech Publishing

Feb 8, 2026 · Artificial Intelligence

Why Bigger Context Windows Fail and How Structured Graphs Deliver Precise Fact Retrieval

The article argues that large language models struggle with exact factual answers and that extending context windows often degrades performance, while knowledge graphs provide structured, traceable retrieval; it proposes a unified graph monograph and small, focused context slices to empower LLMs with accurate information.

Context RetrievalKnowledge GraphLLM

0 likes · 10 min read

Why Bigger Context Windows Fail and How Structured Graphs Deliver Precise Fact Retrieval

AI2ML AI to Machine Learning

Feb 7, 2026 · Artificial Intelligence

The Alarming Implication of Claude Opus 4.6: Offline Open‑Source LLMs Are the Strongest Corporate Moat

Claude Opus 4.6 showcases unprecedented finance‑scenario analysis and a powerful Skills integration, prompting the author to argue that enterprises should adopt offline open‑source large language models to safeguard their proprietary prompts and maintain a robust competitive moat.

ClaudeFinanceLLM

0 likes · 4 min read

The Alarming Implication of Claude Opus 4.6: Offline Open‑Source LLMs Are the Strongest Corporate Moat

AI Tech Publishing

Feb 6, 2026 · Artificial Intelligence

2026 Large Model Engineering Roadmap: From Foundations to Production

This roadmap outlines a step‑by‑step learning path for building, optimizing, and safely deploying large language model systems, covering fundamentals, vector stores, RAG, advanced techniques, fine‑tuning, inference speed, deployment, observability, agents, and production safeguards.

DeploymentLLMObservability

0 likes · 5 min read

2026 Large Model Engineering Roadmap: From Foundations to Production

Instant Consumer Technology Team

Feb 6, 2026 · Artificial Intelligence

How AI‑Powered Agentic Labeling Transforms Customer Conversation Tagging

This article details an end‑to‑end AI system that replaces manual, error‑prone tagging of customer dialogues with a large‑language‑model‑driven, vector‑based pipeline that automatically discovers, clusters, and iteratively refines business‑level tags, dramatically cutting cycle time and improving coverage.

HDBSCANLLMMilvus

0 likes · 33 min read

How AI‑Powered Agentic Labeling Transforms Customer Conversation Tagging

PaperAgent

Feb 6, 2026 · Artificial Intelligence

How xMemory Cuts Tokens by 30% While Boosting Agent QA Scores Over 10 Points

The paper introduces xMemory, a hierarchical "split‑aggregate‑retrieve" framework that reduces token usage by up to 30% and improves QA performance by more than 10 points in long‑range agent conversations, outperforming traditional RAG across multiple LLMs.

Agent MemoryHierarchical RetrievalLLM

0 likes · 8 min read

How xMemory Cuts Tokens by 30% While Boosting Agent QA Scores Over 10 Points

Data STUDIO

Feb 6, 2026 · Artificial Intelligence

Building a Basic Chatbot with LangGraph: Step‑by‑Step AI Agent Tutorial

This article walks through building AI agents with LangGraph in Python, starting with a simple GCD workflow and then creating a memory‑enabled chatbot using GPT‑4o, covering state management, nodes, edges, conditional loops, recursion limits, and visual debugging.

AI agentsChatbotLLM

0 likes · 18 min read

Building a Basic Chatbot with LangGraph: Step‑by‑Step AI Agent Tutorial

JD Tech

Feb 5, 2026 · Artificial Intelligence

How OxygenREC Marries Fast and Slow Thinking to Revolutionize E‑commerce Recommendations

OxygenREC presents a fast‑slow thinking, instruction‑following generative framework that overcomes latency, reasoning, and multi‑scene scalability challenges in e‑commerce recommendation, delivering unified training, low‑latency inference, and significant business impact across JD.com scenarios.

Generative AILLMe‑commerce

0 likes · 13 min read

How OxygenREC Marries Fast and Slow Thinking to Revolutionize E‑commerce Recommendations

AI Tech Publishing

Feb 5, 2026 · Artificial Intelligence

From Java Backend to AI Agent Engineer: Essential Knowledge for the Transition

This comprehensive guide walks Java backend developers through the fundamentals of AI agents, comparing agents with traditional workflows, detailing core components such as LLMs, tools, and memory, and exploring practical patterns, frameworks, and code examples to help them successfully shift into AI agent development.

AI agentsAgent FrameworksLLM

0 likes · 35 min read

From Java Backend to AI Agent Engineer: Essential Knowledge for the Transition

Alibaba Cloud Native

Feb 4, 2026 · Artificial Intelligence

Boost Java Agent Performance with End‑to‑End Online Training Using Trinity‑RFT

This article explains how to overcome the training‑deployment gap for Java‑based AI agents by introducing a cloud‑native, low‑intrusion online training pipeline built on AgentScope Java and Trinity‑RFT, detailing architecture, configuration, custom selection and reward strategies, and showing measurable accuracy gains on a SQL‑Agent benchmark.

JavaLLMOnlineTraining

0 likes · 21 min read

Boost Java Agent Performance with End‑to‑End Online Training Using Trinity‑RFT

Alibaba Cloud Developer

Feb 4, 2026 · Artificial Intelligence

Progressive Disclosure: Making Multi‑Skill LLM Agents Efficient and Scalable

This article examines the core challenge of giving large‑language‑model agents many abilities while keeping context size limited, compares three common loading strategies, introduces a progressive‑disclosure skill mechanism with three loading layers, and details its implementation, benefits, limitations, and suitable use cases in AgentScope‑Java.

Context ManagementJavaLLM

0 likes · 17 min read

Progressive Disclosure: Making Multi‑Skill LLM Agents Efficient and Scalable

Baobao Algorithm Notes

Feb 4, 2026 · Artificial Intelligence

Mastering Reinforcement Learning: From Basics to Advanced Agentic RL Techniques

This comprehensive guide walks through reinforcement learning fundamentals, MDP modeling, value functions, Bellman equations, and key algorithms such as Q‑learning, REINFORCE, PPO, DPO, and GRPO, then contrasts LLM‑RL with Agentic‑RL and surveys leading industry frameworks and real‑world applications.

Artificial IntelligenceLLMMachine Learning

0 likes · 42 min read

Mastering Reinforcement Learning: From Basics to Advanced Agentic RL Techniques

JD Cloud Developers

Feb 4, 2026 · Artificial Intelligence

How Deep Research Transforms LLMs into Autonomous AI Researchers

This article examines Deep Research, an AI system that adds autonomous planning and deep reasoning to large language models, enabling them to browse the web, perform long‑chain reasoning, and generate professional, citation‑rich reports for complex tasks such as industry trend analysis and technical competitive research.

AI researchLLMReAct

0 likes · 22 min read

How Deep Research Transforms LLMs into Autonomous AI Researchers

JD Tech Talk

Feb 4, 2026 · Artificial Intelligence

How Deep Research Turns LLMs into Autonomous AI Researchers

This article explains the background, core features, underlying ReAct‑based architecture, and engineering solutions of Deep Research—a system that equips large language models with autonomous planning, long‑chain reasoning, and professional report generation to tackle complex information‑intensive tasks.

AI researchLLMPrompt Engineering

0 likes · 21 min read

How Deep Research Turns LLMs into Autonomous AI Researchers

Wu Shixiong's Large Model Academy

Feb 4, 2026 · Artificial Intelligence

Why LLM Agents Rush to Call Tools and How to Stop Them

The article explains that premature tool calls in LLM agents stem from a data‑distribution bias in fine‑tuning, and it presents practical fixes such as adding non‑tool samples, enforcing a Thought chain, and using negative sampling to teach the model when to think before acting.

LLMThought ChainTool Calling

0 likes · 10 min read

Why LLM Agents Rush to Call Tools and How to Stop Them

Wuming AI

Feb 3, 2026 · Artificial Intelligence

How Short‑Term vs Long‑Term Memory Works in LLM‑Powered Autonomous Agents

This article demystifies short‑term and long‑term memory in LLM‑driven autonomous agents, explaining their mechanisms, limitations, and practical implementations such as sliding windows, summarization, and vector‑based retrieval, while illustrating each concept with concrete Cherry Studio examples and relevant research references.

Cherry StudioLLMPrompt Engineering

0 likes · 7 min read

How Short‑Term vs Long‑Term Memory Works in LLM‑Powered Autonomous Agents

Amap Tech

Feb 3, 2026 · Artificial Intelligence

Building a Scalable AI Agent Smart Task Framework for Offline & Event‑Driven Use

After LLMs entered the deep‑water stage, developers realized that agents must go beyond passive Q&A to support asynchronous, long‑running, and subscribable tasks; this article details the design, architecture, and engineering challenges of the “Xiao Gao Teacher AI Agent” smart‑task system, from event‑driven logic to fault‑tolerant deployment.

AI AgentEvent-Driven ArchitectureLLM

0 likes · 19 min read

Building a Scalable AI Agent Smart Task Framework for Offline & Event‑Driven Use

Data STUDIO

Feb 3, 2026 · Artificial Intelligence

Build a Self‑Thinking AI Agent with LangGraph: A Step‑by‑Step Guide

This tutorial explains how LangGraph adds explicit control‑flow, cycles, and shared state to LLM applications, and walks through building a Strava‑based intelligent training coach with Python code, node definitions, state design, graph assembly, and GitHub Actions deployment.

AI agentsLLMLangGraph

0 likes · 12 min read

Build a Self‑Thinking AI Agent with LangGraph: A Step‑by‑Step Guide

PaperAgent

Feb 3, 2026 · Artificial Intelligence

Relink: Turning GraphRAG into a Dynamic, Query‑Driven Knowledge Graph

Relink introduces a ‘reason‑and‑construct’ paradigm that builds knowledge‑graph paths during inference, combining a high‑precision factual graph with a high‑recall potential‑relation pool, using query‑driven dynamic path expansion and contrastive alignment to markedly improve multi‑hop QA performance and robustness to sparse knowledge.

Dynamic RetrievalGraphRAGKnowledge Graph

0 likes · 8 min read

Relink: Turning GraphRAG into a Dynamic, Query‑Driven Knowledge Graph

Wu Shixiong's Large Model Academy

Feb 3, 2026 · Artificial Intelligence

Why Loss Masking Is the Hidden Key to Effective LLM Fine‑Tuning

The article explains how loss masking in supervised fine‑tuning of large language models prevents the model from learning irrelevant tokens such as user inputs, system prompts, tool outputs, and padding, thereby focusing training on the assistant’s responses and improving performance and generalization.

AI trainingLLMPrompt Engineering

0 likes · 10 min read

Why Loss Masking Is the Hidden Key to Effective LLM Fine‑Tuning

Java Architecture Diary

Feb 2, 2026 · Artificial Intelligence

Why a 10‑Year‑Old Java JSON Library Is Now Targeting LLMs with TOON

json-io, a decade‑old Java JSON library known for zero‑config, circular‑reference support, and lightweight size, has added full TOON (Token‑Oriented Object Notation) read/write capabilities, a token‑efficient format designed for LLMs that can cut serialization costs by 30‑60% and integrates seamlessly with Spring Boot and Spring AI.

AIJavaLLM

0 likes · 9 min read

Why a 10‑Year‑Old Java JSON Library Is Now Targeting LLMs with TOON

Alibaba Cloud Developer

Feb 2, 2026 · Artificial Intelligence

Boosting A/B Experiment Automation: Prompt Engineering Achieves 80% Accuracy

This article details how a production‑grade prompt system powered by large language models was designed to replace manual A/B experiment inspection, introducing a six‑level priority decision tree, robust data preprocessing, and systematic bad‑case analysis that lifted automation accuracy from 68% to over 80% while providing clear, explainable recommendations.

A/B testingData AnalysisLLM

0 likes · 46 min read

Boosting A/B Experiment Automation: Prompt Engineering Achieves 80% Accuracy

AI Waka

Feb 1, 2026 · Artificial Intelligence

Boost LLM Inference Speed: Precision Tricks, Quantization, and Multi‑GPU Strategies

This article reviews practical techniques for accelerating large language model inference—including reduced‑precision formats, post‑training quantization, adapter‑based fine‑tuning, pruning, continuous batch processing, and multi‑GPU deployment—while providing concrete code examples, benchmark results, and guidance on selecting the right approach for production workloads.

GPULLMQuantization

0 likes · 20 min read

Boost LLM Inference Speed: Precision Tricks, Quantization, and Multi‑GPU Strategies

Data Party THU

Feb 1, 2026 · Artificial Intelligence

How AutoLink Turns Schema Linking into an Interactive Database Exploration

AutoLink introduces an autonomous, iterative schema‑linking approach for Text‑to‑SQL that treats schema discovery as a progressive, agent‑driven exploration, dramatically improving recall while cutting token costs, and outperforms existing database‑level and element‑level methods on large benchmarks such as Spider 2.0‑Lite and BIRD.

AutoLinkDatabase ExplorationLLM

0 likes · 19 min read

How AutoLink Turns Schema Linking into an Interactive Database Exploration

Architecture and Beyond

Feb 1, 2026 · Artificial Intelligence

5 High‑ROI Strategies to Supercharge RAG Retrieval Performance

This article outlines five practical engineering strategies—multi‑vector retrieval, manual splitting and labeling, scalar enhancement, context augmentation, and dense‑sparse vector integration—that together address common RAG retrieval bottlenecks and dramatically improve recall stability and answer quality.

BM25EngineeringLLM

0 likes · 17 min read

5 High‑ROI Strategies to Supercharge RAG Retrieval Performance

High Availability Architecture

Feb 1, 2026 · Artificial Intelligence

Inside Clawdbot: How Its Agent, Tool Calls, and Browser Engine Operate

This article provides a deep technical walkthrough of Clawdbot’s architecture, covering its TypeScript CLI core, lane‑based command queue, agent runner, memory system with JSONL and vector search, sandboxed computer control, security allowlist, and the semantic snapshot browser tool.

AI AgentClawdbotLLM

0 likes · 13 min read

Inside Clawdbot: How Its Agent, Tool Calls, and Browser Engine Operate

AI Waka

Jan 31, 2026 · Artificial Intelligence

Build a 2026‑Ready LangGraph AI Agent: A Step‑by‑Step Guide

This tutorial walks you through constructing a LangGraph‑based AI agent for automated Strava training plans, covering core concepts like state, nodes, and edges, detailed workflow steps, Python code examples, conditional graph routing, testing, and deployment via GitHub Actions.

AI AgentLLMLangGraph

0 likes · 18 min read

Build a 2026‑Ready LangGraph AI Agent: A Step‑by‑Step Guide

Data Party THU

Jan 31, 2026 · Artificial Intelligence

Can LLMs Learn While Being Tested? Inside the TTT-Discover Breakthrough

The article examines the Test‑Time Training to Discover (TTT‑Discover) approach, which applies reinforcement learning during inference to let large language models continuously improve on single test problems, and reports strong results across mathematics, GPU kernel optimization, algorithm design, and biology.

AI researchLLMScientific Discovery

0 likes · 9 min read

Can LLMs Learn While Being Tested? Inside the TTT-Discover Breakthrough

Ops Development Stories

Jan 31, 2026 · Artificial Intelligence

Unlock Multi‑Model AI Routing with Claude Code Router: A Complete Setup Guide

This article explains why Claude Code Router is needed, outlines its three core advantages, and provides a step‑by‑step guide—including environment preparation, installation, configuration, and usage—so developers can route requests to multiple LLM providers without an Anthropic account.

AICLILLM

0 likes · 9 min read

Unlock Multi‑Model AI Routing with Claude Code Router: A Complete Setup Guide

DaTaobao Tech

Jan 30, 2026 · Artificial Intelligence

Human‑like LLM Replies for Live Digital Hosts: ASR‑Based Style Transfer and Reward Modeling

This article proposes an ASR‑driven pipeline that creates high‑quality AI‑reply vs. human‑like reply pairs, trains a rewrite model and a reward model, and uses GRPO reinforcement learning to generate natural, helpful, and less AI‑sounding responses in digital‑human live streaming, achieving 92% accuracy and 97% helpfulness while improving user experience.

ASR dataLLMQwen

0 likes · 20 min read

Human‑like LLM Replies for Live Digital Hosts: ASR‑Based Style Transfer and Reward Modeling

Alibaba Cloud Infrastructure

Jan 30, 2026 · Artificial Intelligence

Deploy Kimi 2.5 LLM on Alibaba Cloud with SGLang, RBG, and Openclaw

This guide walks through preparing the Kimi 2.5 model, uploading it to OSS, configuring persistent storage, and using SGLang, RoleBasedGroup, and Openclaw to deploy a production‑grade inference service on Alibaba Cloud Kubernetes with step‑by‑step commands and YAML examples.

AIDeploymentKimi

0 likes · 14 min read

Deploy Kimi 2.5 LLM on Alibaba Cloud with SGLang, RBG, and Openclaw

AI Engineering

Jan 30, 2026 · Artificial Intelligence

Why Letting LLMs Argue Improves Their Reasoning Quality

Google’s recent study of over 8,000 reasoning tasks shows that advanced LLMs like DeepSeek‑R1 spontaneously develop multiple internal “expert” personas that debate, and that activating a discovered “social switch” dramatically raises accuracy, revealing that engineered conflict can enhance AI reasoning.

AI debateFeature ControlLLM

0 likes · 8 min read

Why Letting LLMs Argue Improves Their Reasoning Quality

PaperAgent

Jan 30, 2026 · Artificial Intelligence

How LLM‑in‑Sandbox Turns Large Models into General‑Purpose Agents Without Extra Training

The LLM‑in‑Sandbox framework places large language models inside a virtual machine that provides external tool access, persistent storage, and code execution, yielding up to a 24.2% performance boost across six benchmark tasks without additional training, and it scales from zero‑shot to reinforcement‑learning‑enhanced agents while remaining cost‑effective.

EfficiencyLLMagentic AI

0 likes · 6 min read

How LLM‑in‑Sandbox Turns Large Models into General‑Purpose Agents Without Extra Training

Wuming AI

Jan 29, 2026 · Artificial Intelligence

How to Compress Long LLM Conversations with Smart Summarization and Sliding Window

This article explains how to keep essential information from lengthy AI chat histories by using an intelligent summarization prompt, injecting the summary as a system message, and applying a sliding‑window strategy that retains the last three exchanges, thereby reducing token cost and preserving context continuity.

CLLMPrompt Engineering

0 likes · 11 min read

How to Compress Long LLM Conversations with Smart Summarization and Sliding Window

AI Engineering

Jan 29, 2026 · Artificial Intelligence

Andrej Karpathy Says He’s Surrendered to AI Coding – A Workflow Revolution

Andrej Karpathy recounts how, within weeks, he shifted from 80% manual coding to 80% AI‑generated code, highlighting AI’s new logical flaws, its tireless persistence, expanded capabilities beyond speed, practical tips, skill erosion, and a 2026 forecast of ubiquitous AI‑produced content.

AI codingAndrej KarpathyLLM

0 likes · 7 min read

Andrej Karpathy Says He’s Surrendered to AI Coding – A Workflow Revolution

Bighead's Algorithm Notes

Jan 28, 2026 · Artificial Intelligence

How HiveMind Optimizes LLM Multi‑Agent Trading Systems via Contribution‑Guided Online Prompts

The HiveMind framework introduces a contribution‑guided online prompt optimization (CG‑OPO) that quantifies each LLM‑driven agent’s impact with Shapley values and uses a DAG‑Shapley algorithm to efficiently attribute credit, enabling real‑time adaptive optimization of multi‑agent stock‑trading systems and achieving superior returns with far fewer LLM calls.

DAG-ShapleyFinancial TradingLLM

0 likes · 15 min read

How HiveMind Optimizes LLM Multi‑Agent Trading Systems via Contribution‑Guided Online Prompts

Amap Tech

Jan 28, 2026 · Artificial Intelligence

Can Databases Teach Themselves? Exploring Agents‑Based Self‑Explaining Text‑to‑SQL

This article introduces the Agents‑Companion paradigm for Text‑to‑SQL, detailing how self‑describing database agents autonomously mine schema, statistics and semantics to generate high‑quality evidence, thereby bridging the gap between academic research and industrial deployment and significantly improving query accuracy.

AIDatabase MiningLLM

0 likes · 8 min read

Can Databases Teach Themselves? Exploring Agents‑Based Self‑Explaining Text‑to‑SQL

Ops Development Stories

Jan 28, 2026 · Artificial Intelligence

Understanding MCP, Agent, Skill, and Rule: How LLMs Differ from Traditional APIs

This article systematically explains the concepts of MCP, Agent, Skill, and Rule from an engineering viewpoint, highlighting their roles, differences from traditional API calls, and how they enable large language models to safely and autonomously interact with external tools.

AI ArchitectureLLMMCP

0 likes · 8 min read

Understanding MCP, Agent, Skill, and Rule: How LLMs Differ from Traditional APIs

PaperAgent

Jan 27, 2026 · Artificial Intelligence

How Agentic‑R Boosts Multi‑Turn Retrieval for LLMs by 2–3 EM Points

This article analyzes the Agentic‑R framework, which upgrades traditional single‑hop Retrieval‑Augmented Generation by introducing dual‑perspective scoring and a bidirectional flywheel, resulting in 2–3 absolute EM improvements across seven QA datasets and a 10–15% reduction in search rounds.

LLMMulti-hop reasoningRAG

0 likes · 6 min read

How Agentic‑R Boosts Multi‑Turn Retrieval for LLMs by 2–3 EM Points

Old Zhang's AI Learning

Jan 27, 2026 · Artificial Intelligence

DeepSeek-OCR 2 Enables AI to Read Images with Human‑Like Logical Flow

DeepSeek-OCR 2 introduces Visual Causal Flow and a LLM‑based visual encoder, achieving 91.09% accuracy on OmniDocBench v1.5, while providing detailed installation, two inference modes (vLLM and Transformers), and an analysis of its strengths and limitations for complex document processing.

DeepEncoder V2DeepSeek-OCR 2LLM

0 likes · 9 min read

DeepSeek-OCR 2 Enables AI to Read Images with Human‑Like Logical Flow

AI Tech Publishing

Jan 27, 2026 · Artificial Intelligence

Step‑by‑Step: Adding Skill Capabilities to Your Agent System

This article walks through the design patterns, three‑level loading mechanism, and practical implementation steps for integrating reusable, domain‑specific Skills into an existing Agent system, covering both local and distributed deployments with Redis‑based versioning and sandboxed execution.

LLMMeta-Tool PatternProgressive disclosure

0 likes · 14 min read

Step‑by‑Step: Adding Skill Capabilities to Your Agent System

AI Cyberspace

Jan 26, 2026 · Artificial Intelligence

How NVFP4 Quantization Supercharges LLM Inference on NVIDIA DGX

This article explains the NVFP4 4‑bit floating‑point quantization technique, shows how to deploy Qwen3‑30B‑A3B models with TensorRT‑LLM and vLLM, compares performance across NVFP4, AWQ and INT8 quantizations, and provides practical profiling commands for NVIDIA DGX systems.

LLMNVFP4NVIDIA DGX

0 likes · 23 min read

How NVFP4 Quantization Supercharges LLM Inference on NVIDIA DGX

Alibaba Cloud Developer

Jan 26, 2026 · Artificial Intelligence

How We Scaled a 3.5B MoE LLM for Real‑Time Search Relevance

This article details the engineering challenges and solutions for deploying a 3.5 billion‑parameter MoE LLM in Taobao's search relevance pipeline, covering large‑batch scheduling, dynamic load balancing, intra‑batch KV‑Cache reuse, and MoE kernel tuning to meet sub‑second latency requirements.

Inference OptimizationKV CacheLLM

0 likes · 15 min read

How We Scaled a 3.5B MoE LLM for Real‑Time Search Relevance

Fun with Large Models

Jan 25, 2026 · Artificial Intelligence

Complete Guide to Agent Skills: Core Concepts, Design Patterns, and Hands‑On Code

This article explains the three‑layer Agent Skills architecture, demonstrates step‑by‑step creation and configuration of a Skill using Claude Code—including metadata, instruction, and resource layers, advanced scripting integration, and a detailed comparison with MCP, highlighting token savings and use‑case differences.

AI AgentAgent SkillsClaude Code

0 likes · 18 min read

Complete Guide to Agent Skills: Core Concepts, Design Patterns, and Hands‑On Code

AI Frontier Lectures

Jan 25, 2026 · Artificial Intelligence

Turning Chain‑of‑Thought into Images: The Render‑of‑Thought Breakthrough

Render‑of‑Thought (RoT) proposes a novel visual‑latent reasoning framework that compresses textual chain‑of‑thought into dense image embeddings, achieving faster inference, better interpretability, and plug‑and‑play integration without costly pre‑training, as demonstrated on multiple math and logic benchmarks.

Chain-of-ThoughtImplicit CoTInference Acceleration

0 likes · 11 min read

Turning Chain‑of‑Thought into Images: The Render‑of‑Thought Breakthrough

PaperAgent

Jan 25, 2026 · Artificial Intelligence

How Deep GraphRAG Solves Retrieval’s Three‑Way Dilemma with Hierarchical Search

Deep GraphRAG tackles the three‑fold dilemma of traditional Retrieval‑Augmented Generation by introducing hierarchical global‑to‑local retrieval, a beam‑search dynamic reordering that cuts latency, and a DW‑GRPO reinforcement‑learning module that adaptively weights rewards, achieving near‑state‑of‑the‑art performance with up to 86% faster inference.

AI researchGraphRAGHierarchical Retrieval

0 likes · 5 min read

How Deep GraphRAG Solves Retrieval’s Three‑Way Dilemma with Hierarchical Search

Baobao Algorithm Notes

Jan 24, 2026 · Artificial Intelligence

What Advances Do GRPO, DAPO, GSPO, and SAPO Bring Over PPO?

After DPO, the typical research trajectory moves through GRPO, DAPO, GSPO, and SAPO, each introducing new optimization objectives, sampling strategies, and reward‑shaping techniques that aim to reduce memory usage, improve gradient stability, and enhance the efficiency of large‑model reinforcement learning.

DAPOGRPOGSPO

0 likes · 6 min read

What Advances Do GRPO, DAPO, GSPO, and SAPO Bring Over PPO?

Tech Verticals & Horizontals

Jan 23, 2026 · Artificial Intelligence

Comparing 9 Major Agent Development Frameworks: Choosing the Best Fit

This article provides an in‑depth comparison of nine mainstream AI agent development frameworks—Pydantic AI, SmolAgents, DeepAgents, LlamaIndex, CAMEL, AutoGen, CrewAI, LangGraph, and OpenAI Agents SDK—detailing their design principles, strengths, weaknesses, typical scenarios, and guidance for selecting or mixing them in production.

Agent FrameworksLLMLangChain

0 likes · 30 min read

Comparing 9 Major Agent Development Frameworks: Choosing the Best Fit

PaperAgent

Jan 23, 2026 · Artificial Intelligence

Top AAAI 2026 Papers: New Vision‑Language‑Action Model, LLM2CLIP and More

AAAI 2026 in Singapore showcased 23,680 submissions, highlighting breakthrough papers such as ReconVLA’s reconstructive vision‑language‑action model, LLM2CLIP’s language‑enhanced multimodal representation, a sheaflet‑based hypergraph neural network design, advances in description logic modeling, and a novel causal discovery method for dynamical systems.

AAAI 2026AI PapersLLM

0 likes · 7 min read

Top AAAI 2026 Papers: New Vision‑Language‑Action Model, LLM2CLIP and More

Data STUDIO

Jan 23, 2026 · Artificial Intelligence

Choosing the Best AI Agent Framework: A Practical Guide

This article explains the core AI agent loop, why dedicated frameworks are needed, compares eight popular frameworks—including RelevanceAI, smolagents, PhiData, LangChain, LlamaIndex, CrewAI, AutoGen, and LangGraph—offers selection criteria, and provides hands‑on code demos for AutoGen and LangGraph.

AI agentsAutoGenLLM

0 likes · 19 min read

Choosing the Best AI Agent Framework: A Practical Guide

Node.js Tech Stack

Jan 23, 2026 · Backend Development

Bun’s New --cpu-prof-md Flag Generates AI‑Friendly Markdown Profiling, Prompting a Node.js Response

Bun introduces the --cpu-prof-md flag that outputs CPU profiling data as structured Markdown for large language models, earning praise from Vue creator Evan You and inspiring Node.js core contributor Matteo Collina to release a pprof‑to‑md converter, highlighting a shift toward AI‑oriented CLI tools.

AI debuggingBunCLI tools

0 likes · 7 min read

Bun’s New --cpu-prof-md Flag Generates AI‑Friendly Markdown Profiling, Prompting a Node.js Response

Architecture Digest

Jan 22, 2026 · Artificial Intelligence

Unlock AI-Powered Document Search with WeKnora: A Hands‑On Guide

WeKnora is an open‑source LLM‑driven framework that transforms complex, multi‑format documents into searchable semantic knowledge, offering features such as Agent mode, hybrid retrieval, secure private deployment, and an easy‑to‑use web UI, with step‑by‑step installation instructions and demo screenshots.

AILLMOpen Source

0 likes · 7 min read

Unlock AI-Powered Document Search with WeKnora: A Hands‑On Guide

Woodpecker Software Testing

Jan 21, 2026 · Backend Development

Building a Daily News Summarizer: Design, Implementation, and Automation (Part 4)

This article walks through the complete design and implementation of a daily news summarizer, covering source selection, web‑scraping with BeautifulSoup, database schema with SQLModel, LLM‑based summarization, FastAPI endpoints, front‑end layout, category/date browsing, and a scheduled update loop.

FastAPILLMNews Summarization

0 likes · 22 min read

Building a Daily News Summarizer: Design, Implementation, and Automation (Part 4)

DeWu Technology

Jan 21, 2026 · Artificial Intelligence

Breaking the Recommendation Feedback Loop with LLM‑Powered Dynamic User Knowledge Graphs

By integrating large language models to dynamically construct user knowledge graphs and applying two‑hop reasoning, the authors enhance serendipity in a large‑scale e‑commerce community recommendation system, achieving significant online gains in diversity, novelty, and user engagement metrics.

Industrial DeploymentLLMSerendipity

0 likes · 17 min read

Breaking the Recommendation Feedback Loop with LLM‑Powered Dynamic User Knowledge Graphs

AI Frontier Lectures

Jan 21, 2026 · Artificial Intelligence

How AP2O‑Coder Cuts LLM Code Errors by Up to 3% with Adaptive Preference Optimization

The paper introduces AP2O‑Coder, an adaptive progressive preference optimization framework that systematically captures error types, progressively refines LLM code generation, and dynamically adapts training data, achieving up to a 3% pass@k improvement across multiple open‑source models while reducing data requirements.

AP2O-CoderLLMMachine Learning

0 likes · 11 min read

How AP2O‑Coder Cuts LLM Code Errors by Up to 3% with Adaptive Preference Optimization

Alibaba Cloud Infrastructure

Jan 21, 2026 · Artificial Intelligence

Boost LLM Performance: Deploy Qwen3‑235B with PD‑Separation, MoE, SGLang & RBG

This article details how to deploy the 235‑billion‑parameter Qwen3‑235B model using PD‑separation and MoE techniques, explains the associated challenges, and demonstrates a production‑grade solution built on the high‑performance SGLang inference engine and the RoleBasedGroup (RBG) orchestration framework, complete with benchmark results and best‑practice YAML examples.

AIKubernetesLLM

0 likes · 21 min read

Boost LLM Performance: Deploy Qwen3‑235B with PD‑Separation, MoE, SGLang & RBG

Data Party THU

Jan 21, 2026 · Artificial Intelligence

What DeepSeek’s Secret “Model1” Reveals About the Upcoming V4 LLM

Analyzing recent DeepSeek flashmla repository commits, the article uncovers that the mysterious Model1 likely corresponds to DeepSeek‑V4, detailing architectural shifts to a 512‑dimensional head, full support for NVIDIA Blackwell GPUs, token‑level sparse MLA, and new mechanisms such as Value Vector Position Awareness and Engram.

DeepSeekDeepSeek V4GPU optimization

0 likes · 6 min read

What DeepSeek’s Secret “Model1” Reveals About the Upcoming V4 LLM

Su San Talks Tech

Jan 21, 2026 · Artificial Intelligence

Turn PDFs into Smart Search Engines with WeKnora’s Open‑Source LLM Framework

WeKnora is an open‑source Tencent framework that leverages large language models, multimodal parsing and hybrid retrieval to let users query PDFs, Word files, images and other complex documents with natural language, offering a web UI, API and secure private‑cloud deployment options.

DockerLLMOpen Source

0 likes · 6 min read

Turn PDFs into Smart Search Engines with WeKnora’s Open‑Source LLM Framework

Java Backend Technology

Jan 21, 2026 · Artificial Intelligence

Unlock Seamless Document Search with WeKnora: An Open‑Source LLM‑Powered Retrieval Framework

WeKnora is an open‑source, LLM‑driven document understanding and semantic search framework that extracts structured content from PDFs, Word files, and images, builds a unified knowledge graph, and enables natural‑language queries through a modular RAG architecture with flexible deployment options.

AILLMRAG

0 likes · 7 min read

Unlock Seamless Document Search with WeKnora: An Open‑Source LLM‑Powered Retrieval Framework

Zhihu Tech Column

Jan 20, 2026 · Artificial Intelligence

How AI‑Powered Agentic Workflows Cut Costs and Boosted R&D Efficiency by Over 30% – A Real‑World Case Study

This article details a multi‑year, data‑driven transformation in which a product‑research team leveraged large‑model AI and agentic workflows to automate repetitive coding, streamline hot‑topic discussion creation, and replace a seven‑person outsourcing crew, achieving up to 38.6% project‑time reduction, a 22.5‑25 PD weekly capacity gain, and a dramatic drop in marginal costs.

AICost ReductionEfficiency

0 likes · 29 min read

How AI‑Powered Agentic Workflows Cut Costs and Boosted R&D Efficiency by Over 30% – A Real‑World Case Study

PaperAgent

Jan 20, 2026 · Artificial Intelligence

How Intrinsic Self‑Critique Boosts LLM Planning Accuracy to 89% %

Google DeepMind's new "Intrinsic Self‑Critique" method lets large language models iteratively self‑evaluate and rewrite their plans, raising Blocksworld planning accuracy from 49.8% to 89.3% and setting new records across multiple planning benchmarks.

AI researchLLMPlanning

0 likes · 5 min read

How Intrinsic Self‑Critique Boosts LLM Planning Accuracy to 89% %

AI Tech Publishing

Jan 20, 2026 · Artificial Intelligence

10 Core Architecture Patterns for Scalable LLM Skills and Context Engineering

The article presents a ten‑step architecture for implementing scalable LLM Skills, covering a meta‑tool pattern to avoid tool explosion, progressive three‑level loading to save tokens, script execution outside the LLM context, Redis‑based storage with pub/sub updates, version locking, dynamic addition, batch loading, and file‑system strategies.

Context EngineeringLLMMeta-Tool

0 likes · 10 min read

10 Core Architecture Patterns for Scalable LLM Skills and Context Engineering

Data Party THU

Jan 19, 2026 · Artificial Intelligence

How VersatileFFN Cuts Memory Use While Boosting LLM Performance

The article introduces Huawei's VersatileFFN, an adaptive wide‑and‑deep feed‑forward design for large language models that reuses parameters to slash memory consumption while delivering stronger inference, detailing its dual‑system inspiration, technical mechanisms, experimental gains, and implications for efficient LLM deployment.

Adaptive ComputationLLMTransformer

0 likes · 8 min read

How VersatileFFN Cuts Memory Use While Boosting LLM Performance

PaperAgent

Jan 19, 2026 · Artificial Intelligence

How Reinforcement Learning Can Boost LLM Reasoning by Shaping Token Distributions

Recent research shows that applying reinforcement learning to large language models can dramatically improve inference performance, but its effectiveness depends on the token distribution produced during pre‑training, prompting a novel rewrite of cross‑entropy as a single‑step policy gradient with controllable entropy parameters.

LLMRLToken Distribution

0 likes · 6 min read

How Reinforcement Learning Can Boost LLM Reasoning by Shaping Token Distributions

AI Engineering

Jan 18, 2026 · Artificial Intelligence

Why a Single For Loop Powers BU’s Open‑Source Agent Framework

The BU Browser Use team open‑sourced bu‑agent‑sdk, a minimal LLM agent framework that treats the agent as a simple for‑loop and adds explicit done tools, context compression, ephemeral messages, and a unified LLM interface, enabling flexible, low‑overhead AI applications.

Agent FrameworkLLMOpen Source

0 likes · 7 min read

Why a Single For Loop Powers BU’s Open‑Source Agent Framework

MaGe Linux Operations

Jan 18, 2026 · Artificial Intelligence

How to Deploy Scalable LLM Inference on Kubernetes with GPU Autoscaling

This guide walks through building a production‑grade Kubernetes GPU cluster for large language model inference, covering hardware sizing, GPU resource scheduling, model storage options, automated scaling with HPA, health checks, monitoring, troubleshooting, and multi‑model deployment strategies.

AutoscalingDockerGPU

0 likes · 49 min read

How to Deploy Scalable LLM Inference on Kubernetes with GPU Autoscaling

PaperAgent

Jan 17, 2026 · Artificial Intelligence

Hypergraphs Turn LLMs into Reliable Material Discovery Agents

This article explains how representing multi‑component scientific knowledge as hyperedges, rather than traditional triples, enables large language models to traverse complex material interactions, reduce hallucinations, and generate verifiable experimental designs, demonstrated through a large hypergraph built from thousands of scaffold papers.

AI reasoningHypergraphLLM

0 likes · 7 min read

Hypergraphs Turn LLMs into Reliable Material Discovery Agents

AI Engineering

Jan 17, 2026 · Artificial Intelligence

Can Tiny LLMs Compute Accurately? WorldModel‑Qwen Inference‑Time WASM Execution

The article details how the small Qwen‑0.6B model was adapted to generate and run WebAssembly code during inference, achieving deterministic calculations and revealing both the promise and current limitations of integrating world‑model reasoning into tiny LLMs.

LLMQwen-0.6BWASM execution

0 likes · 5 min read

Can Tiny LLMs Compute Accurately? WorldModel‑Qwen Inference‑Time WASM Execution

macrozheng

Jan 16, 2026 · Artificial Intelligence

Unlock Seamless Document Search with WeKnora: An Open‑Source LLM Retrieval Framework

WeKnora is an open‑source Tencent framework that combines large language models with retrieval‑augmented generation to enable fast, accurate semantic search and question answering across heterogeneous documents such as PDFs, Word files, and images, offering a modular, extensible architecture and easy Docker‑based deployment.

AILLMRAG

0 likes · 7 min read

Unlock Seamless Document Search with WeKnora: An Open‑Source LLM Retrieval Framework

php Courses

Jan 16, 2026 · Artificial Intelligence

From Coding to Validation: How AI Is Redefining the Developer’s Role

The rise of large language models has shifted software development from manual coding to AI‑generated drafts, making verification, security, and business alignment the core responsibilities of modern engineers, and outlining the skills, workflows, and challenges needed to thrive in this new paradigm.

AILLMcode generation

0 likes · 11 min read

From Coding to Validation: How AI Is Redefining the Developer’s Role