Kimi K2.6
Other Generally Available
Kimi K2.6 is a frontier-scale, open-source, 1T-parameter model with a 262.1K-token context window. It supports multi-turn tool calling, vision inputs, and structured outputs for agentic workloads.
Context: 262K tokens
Input: $0.95 per MTok
Output: $4.00 per MTok
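The per-MTok prices above translate into a per-request cost with simple arithmetic. The sketch below is only an illustration: the function name and the token counts are made up, and it assumes billing is a plain linear function of input and output tokens at the listed rates.

```python
# Prices copied from the listing above, in USD per million tokens (MTok).
INPUT_USD_PER_MTOK = 0.95
OUTPUT_USD_PER_MTOK = 4.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the listed rates.

    Assumes simple linear pricing; any free tier, caching discount,
    or rounding rule the provider applies is not modeled here.
    """
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Hypothetical request: 200,000 input tokens and 2,000 output tokens.
cost = estimate_cost_usd(200_000, 2_000)
print(f"${cost:.4f}")  # prints $0.1980
```

With these made-up numbers the input side dominates (about $0.19 of the $0.198 total), which is typical for long-context prompts with short answers.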
Advanced Capabilities
Multi-turn Tool Calling: Chained tool calls in one session
Vision Input: Accepts image inputs
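As a rough illustration of what chained tool calls in one session can look like, the sketch below builds the request bodies for two consecutive turns. The `tools` and `tool_choice` fields appear in this model's parameter schema; everything else is an assumption: the `get_weather` tool is hypothetical, the `role: "tool"` / `tool_call_id` message shape follows the common OpenAI-style convention rather than anything confirmed on this page, and the assistant message is hand-written, not real model output.

```python
import json

# Hypothetical tool definition; the name and schema are illustrative only.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]


def first_turn(question: str) -> dict:
    """Build the request body for the opening turn."""
    return {
        "messages": [{"role": "user", "content": question}],
        "tools": TOOLS,
        "tool_choice": "auto",
    }


def follow_up(payload: dict, assistant_message: dict, tool_results: dict) -> dict:
    """Return a new body with the assistant's tool calls and their results appended.

    tool_results maps each tool call id to the value your own code computed.
    The original payload is left unmodified.
    """
    messages = list(payload["messages"])
    messages.append(assistant_message)
    for call in assistant_message.get("tool_calls", []):
        messages.append({
            "role": "tool",                 # assumed OpenAI-style tool message
            "tool_call_id": call["id"],
            "content": json.dumps(tool_results[call["id"]]),
        })
    return {**payload, "messages": messages}


payload = first_turn("What's the weather in Paris?")

# Hand-written stand-in for the model's reply; a real reply comes from the API.
assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather",
                     "arguments": json.dumps({"city": "Paris"})},
    }],
}

next_payload = follow_up(payload, assistant_message, {"call_1": {"temp_c": 18}})
print([m["role"] for m in next_payload["messages"]])
```

Each later turn repeats the same pattern: send the body, run whichever tools the model requested, append the results, and send again until the model answers without requesting a tool.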
Code Examples
curl https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run/@cf/moonshotai/kimi-k2.6 \
  -H "Authorization: Bearer $CLOUDFLARE_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "Explain quantum entanglement in one sentence." }
    ]
  }'
API Parameters
| Name | Type | Description |
|---|---|---|
| messages (required) | array | A list of messages comprising the conversation so far. |
| prompt (required) | string | The input text prompt for the model to generate a response. |
| audio | one of | Audio-output configuration (voice + format) when modalities includes "audio". |
| chat_template_kwargs | object | Provider-specific keyword arguments for the chat template. |
| frequency_penalty | one of | Penalizes new tokens based on their existing frequency in the text so far. |
| function_call (deprecated) | one of | Deprecated. Use tool_choice. |
| functions (deprecated) | array | Deprecated. Use tools. |
| logit_bias | one of | Modifies the likelihood of specified tokens appearing in the completion. Maps token IDs to bias values from -100 to 100. |
| logprobs | one of | Whether to return log probabilities of the output tokens. |
| max_completion_tokens | one of | An upper bound for the number of tokens that can be generated for a completion. |
| max_tokens (deprecated) | one of | Deprecated in favor of max_completion_tokens. The maximum number of tokens to generate. |
| metadata | one of | Set of up to 16 key-value pairs that can be attached to the object. |
| modalities | one of | Output types requested from the model (e.g. ['text'] or ['text', 'audio']). |
| model | string | ID of the model to use (e.g. '@cf/zai-org/glm-4.7-flash'). |
| n | one of | How many chat completion choices to generate for each input message. |
| parallel_tool_calls | boolean | Whether to enable parallel function calling during tool use. |
| prediction | one of | Predicted output content for accelerated decoding. |
| presence_penalty | one of | Penalizes new tokens based on whether they appear in the text so far. |
| reasoning_effort | one of | Constrains effort on reasoning for reasoning models (o1, o3-mini, etc.). |
| requests | array | (no description provided) |
| response_format | one of | Constrains output to a JSON schema or an enum (structured outputs). |
| seed | one of | If specified, the system will make a best effort to sample deterministically. |
| service_tier | one of | Specifies the processing type used for serving the request. |
| stop | one of | Up to 4 sequences where the API will stop generating further tokens. |
| store | one of | Whether to store the output for model distillation / evals. |
| stream | one of | If true, partial message deltas will be sent as server-sent events. |
| stream_options | one of | Options for the streaming response (e.g. include_usage). |
| temperature | one of | Sampling temperature between 0 and 2. |
| tool_choice | one of | Controls which (if any) tool is called: "none", "auto", "required", or a specific tool. |
| tools | array | A list of tools the model may call. |
| top_logprobs | one of | How many top log probabilities to return at each token position (0-20). Requires logprobs=true. |
| top_p | one of | Nucleus sampling: considers the results of the tokens with top_p probability mass. |
| user | string | A unique identifier representing your end-user, for abuse monitoring. |
| web_search_options | one of | Configuration for web-search tool augmentation. |
Sourced from the model's published API schema.
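Several of the parameters above carry explicit limits (temperature between 0 and 2, up to 4 stop sequences, top_logprobs from 0 to 20 and only with logprobs enabled, logit_bias values from -100 to 100). A minimal sketch of checking those limits client-side before building a request might look like the following. The helper names `check_params` and `build_request` are hypothetical, the checks mirror only what the table states, and the service may validate differently. The URL and headers follow the curl example on this page; nothing is sent over the network.

```python
import json
import urllib.request

MODEL = "@cf/moonshotai/kimi-k2.6"


def check_params(params: dict) -> list:
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    temperature = params.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        problems.append("temperature must be between 0 and 2")
    stop = params.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        problems.append("stop accepts up to 4 sequences")
    top_logprobs = params.get("top_logprobs")
    if top_logprobs is not None:
        if not 0 <= top_logprobs <= 20:
            problems.append("top_logprobs must be between 0 and 20")
        if not params.get("logprobs"):
            problems.append("top_logprobs requires logprobs=true")
    for token_id, bias in (params.get("logit_bias") or {}).items():
        if not -100 <= bias <= 100:
            problems.append(f"logit_bias for token {token_id} must be in [-100, 100]")
    return problems


def build_request(account_id: str, token: str, messages: list, **params):
    """Validate params and build (but do not send) the HTTP request."""
    problems = check_params(params)
    if problems:
        raise ValueError("; ".join(problems))
    url = (f"https://api.cloudflare.com/client/v4/accounts/"
           f"{account_id}/ai/run/{MODEL}")
    body = json.dumps({"messages": messages, **params}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder credentials; substitute real values before sending.
req = build_request(
    "ACCOUNT_ID", "API_TOKEN",
    [{"role": "user", "content": "Explain quantum entanglement in one sentence."}],
    temperature=0.2,
    max_completion_tokens=256,
)
print(req.full_url)
# To actually call the API: urllib.request.urlopen(req)
```

Keeping validation separate from sending makes it easy to unit-test request construction without credentials or network access.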