Kimi K2-0905: Moonshot’s Latest Open-Source Model is Live in Cline
Written by Juan Pablo Flores
Published on September 5, 2025

Kimi K2-0905 lands with three major capabilities that change how coding agents operate: a 256k context window, improved tool calling, and stronger front-end development. The model is live in Cline via the Groq (serving at ~349 tokens per second, TPS), Moonshot, OpenRouter, Fireworks, and Cline providers.

What changed from July

The July checkpoint put Kimi K2 on the map with strong tool calling and consistent diff generation. Its diff-edit failure rate currently sits at 5%, which rivals Sonnet-4 at 4% and bests Gemini 2.5 Pro at 10% (lower is better). K2-0905 builds on that foundation with a focus on the capabilities that matter most for agent workflows.

Context window that actually scales

The jump from 131k to 262k tokens (the 256k window, counted as 256 × 1,024) lets you keep larger codebases, conversation histories, and test suites in memory without the typical degradation at context boundaries.

The model's attention mechanism was specifically tuned for long-context scenarios. Token allocation is smarter, coherence is maintained across the full window, and you can finally stop engineering around context limits.
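To get a feel for what the larger window buys you, here is a rough sketch for checking whether a set of files is likely to fit. It assumes 256k means 256 × 1,024 tokens and uses a 4-characters-per-token heuristic, which is an approximation and not Kimi K2's actual tokenizer. The function names and the reserved output budget are hypothetical.

```python
# Assumption: "256k" means 256 * 1024 tokens, matching the 262k figure above.
CONTEXT_WINDOW_TOKENS = 256 * 1024  # 262,144

# Assumption: roughly 4 characters per token. Real tokenizers vary by language
# and content, so treat the result as an estimate only.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(texts: list[str], reserved_for_output: int = 8_000) -> bool:
    """Return True if the estimated prompt still leaves room for the reply."""
    prompt_tokens = sum(estimate_tokens(t) for t in texts)
    return prompt_tokens + reserved_for_output <= CONTEXT_WINDOW_TOKENS
```

For an accurate count you would swap `estimate_tokens` for the provider's own tokenizer or usage metadata. The heuristic is only meant for a quick sanity check before sending a large prompt.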

Performance characteristics and limitations

  • Speed: Groq delivers responses fast enough that model latency stops being a workflow bottleneck. ~349 TPS serving capacity handles production workloads without throttling. Expect some warmup time on first requests (2-3 seconds), but subsequent requests in the same session are significantly faster.
  • Context efficiency: The 256k window maintains coherence without the typical degradation you see in other long-context models. Long conversations stay focused, and the model doesn't lose track of earlier context when processing later tokens.
  • Tool reliability: Expect consistent structured outputs with ~95% or better first-try success rate on well-formed tool schemas. The model rarely produces malformed JSON or unexpected parameter variations.
  • Frontend improvements: Moonshot has noted that K2-0905 is better at frontend coding than its predecessor. We recommend using K2-0905 in Act mode, where it can execute the plan devised by a reasoning model.
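If you want to check the tool-reliability figure against your own workload, one approach is to validate each raw tool call and count how many pass on the first try. The sketch below is illustrative only: the `READ_FILE_SCHEMA` tool and its parameters are hypothetical, and the validation is a simple type check, not Cline's actual tool-call parser.

```python
import json
from typing import Optional

# Hypothetical tool schema: parameter name -> expected Python type.
READ_FILE_SCHEMA = {"path": str, "max_lines": int}


def parse_tool_call(raw: str, schema: dict) -> Optional[dict]:
    """Return the parsed arguments, or None if the output is malformed."""
    try:
        args = json.loads(raw)
    except json.JSONDecodeError:
        return None
    # Reject non-objects and any missing or unexpected parameters.
    if not isinstance(args, dict) or set(args) != set(schema):
        return None
    # Reject parameters whose value has the wrong type.
    if not all(isinstance(args[name], kind) for name, kind in schema.items()):
        return None
    return args


def first_try_success_rate(outputs: list, schema: dict) -> float:
    """Fraction of raw model outputs that parse and validate without a retry."""
    if not outputs:
        return 0.0
    valid = sum(parse_tool_call(o, schema) is not None for o in outputs)
    return valid / len(outputs)
```

Logging the raw outputs from a batch of real requests and running them through `first_try_success_rate` gives you a number you can compare across models or providers.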

Setup in Cline

Kimi K2-0905 is available via the Cline, Groq, Fireworks, Vercel AI Gateway, and OpenRouter providers. It's still $1 per 1M input tokens and $3 per 1M output tokens through most providers, though pricing can vary by provider because the model is open source.
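As a quick back-of-the-envelope check on spend, the quoted prices translate into a simple formula. The sketch below uses the $1 and $3 per 1M token figures as defaults. The function name is hypothetical, and you should substitute your provider's actual rates.

```python
# Prices quoted above: $1 per 1M input tokens, $3 per 1M output tokens.
# Assumption: your provider charges these rates; override them if not.
INPUT_USD_PER_MILLION = 1.00
OUTPUT_USD_PER_MILLION = 3.00


def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price: float = INPUT_USD_PER_MILLION,
    output_price: float = OUTPUT_USD_PER_MILLION,
) -> float:
    """Estimated request cost in USD at per-million-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000
```

For example, a request with 200,000 input tokens and 4,000 output tokens works out to (200,000 × $1 + 4,000 × $3) / 1,000,000, or about $0.21.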
