Kimi K2-0905: Moonshot’s Latest Open-Source Model is Live in Cline

Kimi K2-0905 doubles the context window to 256k tokens and delivers some of the highest reliability tool calling we've seen in an open-source model. Built for coding agents that actually work.

Juan Pablo Flores
September 5, 2025

Three major capabilities land in Kimi K2-0905 that fundamentally change how coding agents operate: a 256k context window, improved tool calling, and enhanced front-end development. The model is live in Cline via the Groq (serving at ~349 TPS), Moonshot, OpenRouter, Fireworks, and Cline providers.

What changed from July

The July checkpoint put Kimi K2 on the map with strong tool calling and consistent diff generation (a diff edit failure rate currently around 5%, rivaling Sonnet-4's 4% and besting Gemini 2.5 Pro's 10%). K2-0905 builds on that foundation with a focus on the capabilities that matter most for agent workflows.

Context window that actually scales

The jump from 131k to 262k tokens lets you keep larger codebases, conversation histories, and test suites in memory without the typical degradation at context boundaries.

The model's attention mechanism was specifically tuned for long-context scenarios. Token allocation is smarter, coherence is maintained across the full window, and you can finally stop engineering around context limits.
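
For a rough sense of scale (assuming ~4 characters per token, a heuristic that varies by language and tokenizer), a 262k-token window holds on the order of 1 MB of source. A minimal sketch for estimating whether a tree of files fits in that budget:

```python
import os

CHARS_PER_TOKEN = 4          # rough heuristic; real tokenization varies
CONTEXT_BUDGET = 262_144     # K2-0905's advertised context window

def estimate_tokens(root: str, exts=(".py", ".ts", ".js", ".go")) -> int:
    """Walk a source tree and estimate total tokens from character counts."""
    total_chars = 0
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(exts):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="ignore") as f:
                    total_chars += len(f.read())
    return total_chars // CHARS_PER_TOKEN

if __name__ == "__main__":
    tokens = estimate_tokens(".")
    print(f"~{tokens:,} tokens; fits in window: {tokens < CONTEXT_BUDGET}")
```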

Performance characteristics and limitations

  • Speed: Groq delivers responses fast enough that model latency stops being a workflow bottleneck. ~349 TPS serving capacity handles production workloads without throttling. Expect some warmup time on first requests (2-3 seconds), but subsequent requests in the same session are significantly faster.
  • Context efficiency: The 256k window maintains coherence without the typical degradation you see in other long-context models. Long conversations stay focused, and the model doesn't lose track of earlier context when processing later tokens.
  • Tool reliability: Expect consistent structured outputs with ~95% or better first-try success rate on well-formed tool schemas. The model rarely produces malformed JSON or unexpected parameter variations (see the sketch after this list).
  • Frontend improvements: Moonshot has noted that K2-0905 is better at frontend coding than its predecessor. We recommend using K2-0905 in Act mode, where it can execute on the plan devised by a reasoning model.
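
The tool-reliability numbers above assume well-formed tool schemas. A minimal sketch of what that looks like through an OpenAI-compatible endpoint (the base URL and model slug here are illustrative; substitute your provider's values):

```python
from openai import OpenAI

# Base URL, API key, and model ID are assumptions for illustration.
client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="YOUR_KEY")

tools = [{
    "type": "function",
    "function": {
        "name": "read_file",
        "description": "Read a file from the workspace and return its contents.",
        "parameters": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Workspace-relative file path."}
            },
            "required": ["path"],
        },
    },
}]

response = client.chat.completions.create(
    model="moonshotai/kimi-k2-0905",  # assumed slug; verify on your provider
    messages=[{"role": "user", "content": "Open src/main.py and summarize it."}],
    tools=tools,
)

# A reliable tool caller returns structured arguments, not free-form text.
for call in response.choices[0].message.tool_calls or []:
    print(call.function.name, call.function.arguments)
```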

Setup in Cline

Kimi K2-0905 is available via the Cline, Groq, Fireworks, Vercel AI Gateway, and OpenRouter providers. Pricing is still $1/$3 in/out per 1M tokens through most providers, though as an open-source model this can vary.
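
As a worked example under those rates (which, again, vary by provider): a single task that sends 200k tokens of context and generates 10k tokens of output costs roughly $0.20 in plus $0.03 out, about $0.23 total.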
