Tagged articles

16 articles

May 24, 2026 · Artificial Intelligence

2026 AI Coding Agent Benchmark: Cursor, Claude Code, and Codex – Who Leads?

A 2026 benchmark evaluates four AI coding agents (Cursor CLI, Claude Code, OpenAI Codex, and Google Gemini) on performance, token consumption, cost per task, and execution time. The top three scores are separated by a tight margin, and cost efficiency and latency emerge as the new competitive frontiers.

AI coding agents · Claude Code · Cost

6 min read

Old Zhang's AI Learning

May 24, 2026 · Industry Insights

How a Fake vLLM PR Exposed the Risks of AI‑Generated Resume Padding

The article dissects a fabricated vLLM pull request that pretended to fix a non‑existent NVIDIA Eagle3 checkpoint bug, explains its bogus test plan, shows how AI‑assisted PR generation can flood open‑source projects, and warns of the trust damage such resume‑padding schemes cause.

AI coding agents · Eagle3 · NVIDIA

7 min read

Java Backend Technology

May 20, 2026 · Artificial Intelligence

Claude Code vs Codex: 10× Cost, 4× Speed – A Deep Comparative Review

The article provides a data‑driven comparison between Anthropic's Claude Code and OpenAI's Codex, covering benchmark scores (SWE‑bench, Terminal‑Bench), blind‑test code‑quality results, token consumption, real‑world cost scenarios, ecosystem integration (MCP), and community feedback to help teams choose the right AI coding agent for their workflow.

AI coding agents · Claude Code · Codex

14 min read

BirdNest Tech Talk

May 18, 2026 · Artificial Intelligence

Taming AI Coding Agents: A Powerful Development Workflow with Engineering Discipline

The article introduces Matt Pocock's open‑source "skills" collection for AI coding agents, shows how it embeds traditional engineering practices such as alignment, domain modeling, TDD, and architecture governance into reusable command sets, and walks through a complete partial‑refund feature implementation using these skills.

AI coding agents · TDD · architecture governance

22 min read

AI Architecture Hub

May 2, 2026 · Artificial Intelligence

Building a Multi‑Agent Coding Stack: Practical Tips, Real‑World Tests, and Cost Savings

The author compares Claude Code, Cursor, and GPT‑based agents, discovers the open‑source Kimi K2.6 model, installs it in minutes, runs three realistic coding tasks, and shows that a mixed‑agent workflow can cut token costs by up to 85% while maintaining comparable quality.

AI coding agents · Agent Swarm · Claude Code

13 min read

AI Open-Source Efficiency Guide

Apr 29, 2026 · Backend Development

How Sentrux Turns AI‑Generated Code into Controlled Architecture Evolution

Sentrux, a Rust‑based real‑time architecture sensor, visualizes a project’s dependency graph as an interactive treemap, scores code health on five metrics, and integrates with AI coding agents via MCP to provide millisecond‑level feedback, enabling continuous quality gating and preventing architectural decay caused by AI‑driven code generation.

AI coding agents · MCP integration · Rust

9 min read

Code Mala Tang

Apr 21, 2026 · Artificial Intelligence

Turn a Simple AGENTS.md into a Senior Engineer’s Playbook for AI Coding Assistants

AGENTS.md is a concise, project‑root file that guides AI coding assistants like Claude Code, Codex, and Cursor to behave like senior engineers by enforcing non‑negotiable rules, minimal changes, verification‑first execution, and clear communication, all distilled from Karpathy’s failure principles and Boris Cherny’s workflow.

AI coding agents · LLM best practices · agentic AI

22 min read

Machine Heart

Apr 18, 2026 · Artificial Intelligence

Can Claude Code’s Auto Mode Replace Human Review? First Pressure Test Results

A systematic pressure test of Claude Code’s Auto Mode across 128 ambiguous DevOps permission scenarios reveals an 81% false‑negative rate, shows that many risky state‑changing actions bypass the classifier via Tier‑2 file edits, and highlights heuristic biases tied to blast radius and risk level.

AI coding agents · Auto mode · Claude Code

10 min read

MeowKitty Programming

Apr 9, 2026 · Industry Insights

When AI Takes Requirements, Runs Tests, and Submits PRs, Programmers’ Job Descriptions Change

The article analyzes how AI coding agents are moving from answering questions to autonomously handling the entire development workflow, reshaping programmers' roles from manual implementation to defining, orchestrating, and validating tasks.

AI coding agents · Automation · agentic AI

8 min read

Design Hub

Mar 31, 2026 · Industry Insights

Four Minor AI News Items Reveal the Shift from Model Competition to Workflow Dominance

The article examines four recent AI coding tool events—a source‑map leak, a computer‑use preview, an OpenAI plugin, and an Apple AI mis‑push—to argue that the AI race is moving from pure model superiority toward competition over workflows, interfaces, and system‑level integration.

AI coding agents · Claude Code · Computer Use

13 min read

ArcThink

Mar 29, 2026 · Artificial Intelligence

Claude Code vs Codex: Deep Technical Architecture, Performance, and Real‑World Experience

This article provides a comprehensive, data‑driven comparison of Anthropic's Claude Code and OpenAI's Codex CLI, covering their divergent architectures, token efficiency, benchmark results, pricing models, and developer community feedback to help engineers choose the tool that best fits their workflow.

AI coding agents · Claude Code · Codex CLI

22 min read

AI Engineering

Mar 22, 2026 · R&D Management

When Code Is Free, How Engineers Stay Valuable – Simon’s Engineering Patterns

The guide reveals that while AI agents have reduced code generation costs to near zero, the true expense lies in ensuring quality, requiring engineers to shift from writing code to defining problems, designing agentic systems, and applying rigorous testing patterns such as red‑green TDD, context‑managed sub‑agents, and advanced Git workflows.

AI coding agents · Cognitive Debt · Git

10 min read

Shi's AI Notebook

Mar 15, 2026 · Artificial Intelligence

How We Built a Full‑Scale Product Using Only Codex‑Generated Code

Over five months the team created an internally used product from an empty Git repository, writing every line of application logic, tests, CI configuration, documentation and tooling with OpenAI's Codex, achieving roughly one‑tenth the effort of manual coding while uncovering new engineering roles and processes.

AI coding agents · Codex · Observability

20 min read

AI Engineering

Jan 29, 2026 · Artificial Intelligence

How a Tiny AGENTS.md Change Boosted AI Coding Accuracy from 53% to 100%

A Vercel team experiment shows that replacing the Skills approach with a small 8 KB AGENTS.md file raised AI coding agents' pass rate from 53% to a perfect 100%, revealing the fragility of explicit tool calls and the strength of passive, always‑available context.

AGENTS.md · AI coding agents · Next.js

11 min read

21CTO

Jan 16, 2026 · Information Security

Do AI Coding Agents Introduce Critical Security Flaws? Insights from a Vibe Study

A Tenzai research team evaluated five popular AI coding agents on three vibe‑coded applications, uncovering comparable bug counts but severe vulnerabilities in Claude, Devin, and Codex outputs, highlighting systemic authorization flaws and the risks of low‑code AI development.

AI coding agents · AI safety · Security Vulnerabilities

5 min read

Java Tech Enthusiast

Jan 12, 2026 · Artificial Intelligence

Can Claude Code Build a Year‑Long System in Just One Hour?

A senior Google engineer reports that Anthropic's Claude Code reproduced, within an hour, a system her team had spent a year developing, sparking debate over AI coding agents, productivity gains, and the future of software engineering.

AI coding agents · Anthropic · Claude Code

11 min read
