
Anthropic Releases Claude Instant 1.2 with Improved Performance on Coding and Math Benchmarks

Anthropic has announced Claude Instant 1.2, an upgrade to the lighter, cheaper version of its AI assistant that draws on Claude 2.0’s capabilities. The new model scores higher on the Codex (58.7% vs. 52.8%) and GSM8k (86.7% vs. 80.9%) benchmarks, and it offers better safety and fewer hallucinations.

php中文网 Courses

Since OpenAI released ChatGPT, many companies have attempted to build their own AI models, but only a few have stood out, with Anthropic being one of them.

The AI startup launched its Claude model in March, demonstrating performance comparable to OpenAI’s GPT‑3.5 and GPT‑4. At the same time, Anthropic released Claude Instant, a lighter, cheaper, and faster version, which is now being upgraded.

On Wednesday, Anthropic announced Claude Instant 1.2, an improved version that incorporates the capabilities of Claude 2.0, the latest release from July.

According to the announcement, drawing on Claude 2.0’s strengths gives Claude Instant 1.2 significant improvements in mathematics, coding, reasoning, and safety. The model also generates longer, more structured responses.

To evaluate the model, Anthropic compared Claude Instant 1.1 and 1.2 on standard benchmark tests, including the Codex evaluation and the Grade‑school Math (GSM8k) benchmark, which are well‑known measures of coding and math proficiency.

Claude Instant 1.2 came out ahead on both: it scored 58.7% on the Codex test versus 52.8% for Claude Instant 1.1, and 86.7% on GSM8k versus 80.9% for Claude Instant 1.1.
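To put those figures in perspective, the short sketch below works out the gain on each benchmark, both in percentage points and relative to the older model's score. The scores are the ones quoted in this article; the dictionary layout and the `gains` helper are illustrative names, not anything published by Anthropic.

```python
from typing import Tuple

# Benchmark scores (percent) as quoted in the article.
# Keys and structure are illustrative assumptions.
SCORES = {
    "Codex": {"instant_1_1": 52.8, "instant_1_2": 58.7},
    "GSM8k": {"instant_1_1": 80.9, "instant_1_2": 86.7},
}


def gains(old: float, new: float) -> Tuple[float, float]:
    """Return (gain in percentage points, gain relative to the old score in %)."""
    absolute = new - old
    relative = absolute / old * 100
    return round(absolute, 1), round(relative, 1)


for name, s in SCORES.items():
    points, relative = gains(s["instant_1_1"], s["instant_1_2"])
    print(f"{name}: +{points} points ({relative}% relative improvement)")
```

The absolute gains are similar on both benchmarks, just under six percentage points. The relative gain is larger on the Codex test because the older model started from a lower score there.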

On other benchmarks, the new model scores slightly higher or slightly lower than its predecessor, with only minor differences.

The quality of the generated answers has also improved, with fewer hallucinations and stronger resistance to jailbreak attempts. In red-team assessments, Claude Instant 1.2 proved safer than its predecessor.

Enterprises can obtain the new model by filling out an interest form, while developers can access it via an API that is considerably cheaper than Claude 2.
