MoonshotAI: Kimi K2 Thinking

Kimi K2 Thinking is Moonshot AI’s most advanced open reasoning model to date, extending the K2 series into agentic, long-horizon reasoning. Built on the trillion-parameter Mixture-of-Experts (MoE) architecture introduced in Kimi K2, it activates 32 billion parameters per forward pass and supports 256 k-token context windows. The model is optimized for persistent step-by-step thought, dynamic tool invocation, and complex reasoning workflows that span hundreds of turns. It interleaves step-by-step reasoning with tool use, enabling autonomous research, coding, and writing that can persist for hundreds of sequential actions without drift. It sets new open-source benchmarks on HLE, BrowseComp, SWE-Multilingual, and LiveCodeBench, while maintaining stable multi-agent behavior through 200–300 tool calls. Built on a large-scale MoE architecture with MuonClip optimization, it combines strong reasoning depth with high inference efficiency for demanding agentic and analytical tasks.

Context Length:262K tokens

Pricing:$0.55M

Created:November 6, 2025

Model Information

Context Length

262K

Created

November 6, 2025

TokenizerOther

Modalitytext->text

Input Modalities

text

Output Modalities

text

Pricing Information

Prompt

$0.55M

per 1M tokens

Completion

$2.25M

per 1M tokens

Supported Parameters

frequency_penaltyinclude_reasoninglogit_biaslogprobsmax_tokensmin_ppresence_penaltyreasoningrepetition_penaltyresponse_formatseedstopstructured_outputstemperaturetool_choicetoolstop_ktop_logprobstop_p

Common Use Cases

Text Generation

• Content writing and editing
• Code generation and debugging
• Creative writing and storytelling
• Translation and summarization

General Applications

• Chatbots and virtual assistants
• Educational content creation
• Research and analysis
• Automation and workflow

Frequently Asked Questions

What is the context length of this model?

This model has a context length of 262K tokens, which means it can process and remember up to 262K tokens of text in a single conversation or request.

How much does it cost to use this model?

Prompt tokens cost $0.55M/1M tokens and completion tokens cost $2.25M/1M tokens.

What modalities does this model support?

This model supports text->text modality, accepting textas input and producing text as output.

When was this model created?

This model was created on November 6, 2025.