Cursor Returns with Composer 2.5: Openly Built on Kimi, 10× Lower Cost, Musk Endorses
Cursor unveiled Composer 2.5, reporting benchmark scores comparable to Opus 4.7 and GPT‑5.5, a ten‑fold cost reduction, explicit use of Moonshot’s Kimi K2.5 as a base, new RL training techniques, and a partnership with SpaceXAI that multiplies compute power, all highlighted by Elon Musk’s retweet.
Hello, I’m Zhijian Jun!
Cursor has abruptly released Composer 2.5, billing it as its "most powerful model to date." The announcement tweet drew 11.82 million views and a retweet from Elon Musk. Beyond the model upgrade, Cursor also announced a larger development effort.
First, the data
Composer 2.5's results on three main benchmarks:
Terminal‑Bench 2.0: 69.3%, essentially tied with Opus 4.7 (69.4%).
SWE‑Bench Multilingual: 79.8%, surpassing GPT‑5.5 (77.8%) and just below Opus 4.7 (80.5%).
CursorBench v3.1 (hard tasks): 63.2%, beating Opus 4.7 (61.6%) and GPT‑5.5 (59.2%).
CursorBench, a test suite designed by Cursor to reflect real‑world programming, shows Composer 2.5 outperforming Opus 4.7 and GPT‑5.5 at the same default configurations.
Although these scores are not the absolute top among frontier models, the cost‑efficiency chart tells a different story.
The cost scatter plot shows Opus 4.7 requiring $7‑10 per task to reach similar CursorBench scores, GPT‑5.5 needing $1‑2, while Composer 2.5 sits in the upper‑right corner with comparable scores and near‑zero cost. The official claim is "10× more efficient than similarly capable models." Pricing is $0.50 per million input tokens and $2.50 per million output tokens, with usage limits doubled in the first week.
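To make the quoted pricing concrete, here is a minimal sketch that turns the per-token rates into a per-task cost. The token counts in the example are hypothetical, chosen only for illustration, and the function ignores anything the article does not mention (such as cache discounts or subscription limits).

```python
# Prices quoted in the article, in USD per million tokens.
INPUT_PRICE_PER_M = 0.50
OUTPUT_PRICE_PER_M = 2.50


def task_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one task at the quoted Composer 2.5 rates."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


# Hypothetical agent task: 400k input tokens and 20k output tokens.
example_cost = task_cost(400_000, 20_000)
print(f"${example_cost:.2f}")  # $0.25
```

Under these assumed token counts, a task costs about a quarter of a dollar, which can be set against the $1‑2 and $7‑10 per‑task figures read off the chart for GPT‑5.5 and Opus 4.7.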
Cursor now discloses the base model
When Composer 2 was released, the community discovered that the underlying model was Kimi K2.5, sparking a controversy over transparency. Product lead Lee Robinson promised to disclose the base model in the next release, and this time Cursor delivered.
Cursor’s announcement states: Composer 2.5 is built on the same open‑source foundation as Composer 2, namely Moonshot’s Kimi K2.5. However, a follow‑up resource‑distribution chart reveals that 85% of the compute for Composer 2.5 comes from Cursor’s own additional training and reinforcement learning, with Kimi K2 and Kimi K2.5 each contributing only 7.5%.
In other words, Kimi K2.5 serves as a starting point; the bulk of the model’s capabilities stem from Cursor’s extensive fine‑tuning and RL work, a scale beyond simple open‑source model fine‑tuning.
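The compute split can be checked with a few lines of arithmetic. This sketch only restates the percentages reported from the chart; the dictionary keys are labels chosen here for illustration.

```python
# Shares of total compute (percent), as reported from Cursor's chart.
compute_share = {
    "Kimi K2": 7.5,
    "Kimi K2.5": 7.5,
    "Cursor training and RL": 85.0,
}

total = sum(compute_share.values())
base_share = compute_share["Kimi K2"] + compute_share["Kimi K2.5"]
cursor_to_base_ratio = compute_share["Cursor training and RL"] / base_share

print(total)                           # 100.0
print(base_share)                      # 15.0
print(round(cursor_to_base_ratio, 2))  # 5.67
```

By these figures, Cursor's own training accounts for roughly 5.7 times the compute attributed to the two Kimi stages combined.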
What they actually did
Cursor disclosed three core technical directions:
Scaling up training size, generating reinforcement‑learning environments far more complex than the previous generation.
Introducing a text‑feedback mechanism that assigns credit precisely across rollouts spanning hundreds of thousands of tokens, accelerating learning.
Increasing synthetic data volume by 25× compared with the prior model, and observing the model discover advanced tricks such as cache parsing and bytecode decompilation.
On the optimizer side, they combined sharded Muon with a double‑grid HSDP, achieving a 0.2‑second optimizer step at trillion‑parameter scale—engineering details typical of model‑centric companies rather than end‑user applications.
The official quote reads: "Composer 2.5 is exceptionally intelligent and up to 10x more efficient than similarly capable models."
Then the bigger announcement
Cursor announced a collaboration with SpaceXAI to train a brand‑new, much larger model from scratch, using ten times the current compute power. The plan involves the Colossus 2 system with a million H100‑equivalent GPUs, leveraging both parties’ data and training expertise.
When SpaceXAI and Cursor first announced the compute partnership in April, many assumed it was merely a commercial rental agreement. The current disclosure shows the partnership has progressed to jointly training a new model from scratch.
Elon Musk retweeted the announcement with the comment, "Give it a try! (partially trained on Colossus 2)." The brief endorsement carries significant weight.
CEO Michael Truell wrote that Composer 2.5 is a significant step up from Composer 2 and that this collaboration with SpaceXAI is just the beginning, with more improvements to follow.
Notable signals
Cursor’s trajectory is becoming clearer: it is moving toward programming models trained on proprietary data with RL rather than relying indefinitely on third‑party APIs. Each generation, from Composer 1 through 1.5 and 2 to 2.5, pushes its in‑house research capability forward.
The new compute backing from SpaceXAI will raise the scale and ceiling of the next model; a million H100‑equivalent GPUs is not a trivial amount.
Explicitly naming Kimi K2.5 as the open‑source starting point reflects a shift toward a more standardized collaboration model between open‑source ecosystems and commercial products, where the differentiation comes from how each party builds on the base.
Cost efficiency is a highlight in its own right: comparable performance at a ten‑fold cost advantage substantially changes which high‑frequency, agent‑driven programming workloads can be scaled economically.
While the next‑generation model is still in training and its real‑world impact remains to be seen, the current baseline appears sturdier than expected.
