Artificial Intelligence 8 min read

Claude Sonnet 4.5 Launch: 30‑Hour Continuous Coding and Major Capability Boost

Anthropic's Claude Sonnet 4.5 arrives as the strongest coding model yet, delivering over 30 hours of uninterrupted programming, major gains in reasoning, math and agent tasks, safety‑aligned training, new API features, benchmark‑leading performance, and pricing identical to Sonnet 4.

Software Engineering 3.0 Era

Sep 30, 2025

Claude Sonnet 4.5 Launch: 30‑Hour Continuous Coding and Major Capability Boost

Anthropic officially released Claude Sonnet 4.5 in the early hours of Beijing time, positioning it as the world’s most powerful code model and highlighting significant breakthroughs in agent construction, computer usage, reasoning, and mathematical abilities.

New Features and Product Upgrades

Claude Code adds a highly requested Checkpoints function for saving progress and rolling back, a refreshed terminal UI, and a native VS Code plugin.

Claude API introduces context‑editing and a memory tool that enable agents to run longer and handle more complex tasks.

Claude apps now support direct code execution and generation of files such as spreadsheets, slides, and documents.

Claude for Chrome extension is opened to Max users on the waiting list.

Front‑Running Performance and Benchmarks

SWE‑bench Verified : achieves the latest optimal level on real‑software coding tests, maintaining focused execution for more than 30 hours on multi‑step tasks.

OSWorld : scores 61.4 %—the top result—up from 42.2 % for Sonnet 4 just four months earlier.

In public evaluations of reasoning and mathematics, Sonnet 4.5 leads across finance, law, medicine, and STEM domains, far surpassing the previous Opus 4.1.

Stronger Alignment and Safety

Sonnet 4.5 is the most aligned Claude model to date, incorporating extensive safety training that markedly reduces harmful behaviors such as gaming, deception, power‑seeking, and false encouragement. It follows the AI Safety Level 3 (ASL‑3) framework, adding classifiers that filter chemical, biological, radiological, and nuclear content. Although occasional false positives occur, the false‑positive rate is ten times lower than that of Sonnet 4.

Pricing and Availability

Input: $3 per million tokens

Output: $15 per million tokens

Sonnet 4.5 can be accessed via multiple channels:

Claude API – model name claude-sonnet-4-5-20250929 Amazon Bedrock – anthropic.claude-sonnet-4-5-20250929-v1:0 Google Cloud Vertex AI – claude-sonnet-4-5@20250929 Claude.ai and Claude Code platforms

Upgrade Guide

Developers currently using Sonnet 4 only need to replace the model identifier with claude-sonnet-4-5-20250929 to migrate. All existing API calls remain functional, and enabling the new memory tool and context‑cleaning features is recommended to fully exploit the model’s performance. Note that the temperature and top_p parameters can no longer be specified together; users must choose one.

Research Preview and SDK

The “Imagine with Claude” preview lets Max subscribers watch Claude generate software in real time without pre‑written code. The newly released Claude Agent SDK provides the same underlying infrastructure used by Claude Code, enabling developers to build autonomous agents.

Outlook

With its comprehensive upgrades in coding, agent capabilities, computer interaction, and safety, Claude Sonnet 4.5 is set to become the benchmark model for the next wave of AI‑driven programming competitions.

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

AI coding API benchmark safety Claude Sonnet 4.5

Written by

Software Engineering 3.0 Era

With large models (LLMs) reshaping countless industries, software engineering is leading the charge into the Software Engineering 3.0 era—model-driven development and operations. This account focuses on the new paradigms, theories, and methods of SE 3.0, and showcases its tools and practices.

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.