QwQ-32B
About
QwQ is the reasoning model of the Qwen series. Compared with conventional instruction-tuned models, QwQ, which is capable of thinking and reasoning, can achieve significantly enhanced performance in downstream tasks, especially hard problems. QwQ-32B is the medium-sized reasoning model, which is capable of achieving competitive performance against state-of-the-art reasoning models, e.g., DeepSeek-R1, o1-mini.
Code Examples
curl https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run/@cf/qwen/qwq-32b \
-H "Authorization: Bearer $CLOUDFLARE_AUTH_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"messages": [
{ "role": "system", "content": "You are a helpful assistant." },
{ "role": "user", "content": "Explain quantum entanglement in one sentence." }
]
}'
API Parameters
Temperature: 0 – 5

| Name | Type | Description |
|---|---|---|
| messages (required) | array | An array of message objects representing the conversation history. |
| prompt (required) | string | The input text prompt for the model to generate a response. |
| frequency_penalty | number | Decreases the likelihood of the model repeating the same lines verbatim. |
| functions (deprecated) | array | Deprecated. Use tools. |
| guided_json | object | JSON schema that should be fulfilled for the response. |
| max_tokens | integer | The maximum number of tokens to generate in the response. |
| presence_penalty | number | Increases the likelihood of the model introducing new topics. |
| raw | boolean | If true, a chat template is not applied and you must adhere to the specific model's expected formatting. |
| repetition_penalty | number | Penalty for repeated tokens; higher values discourage repetition. |
| seed | integer | Random seed for reproducibility of the generation. |
| stream | boolean | If true, the response will be streamed back incrementally using SSE, Server Sent Events. |
| temperature | number | Controls the randomness of the output; higher values produce more random results. |
| tools | array | A list of tools available for the assistant to use. |
| top_k | integer | Limits the AI to choose from the top 'k' most probable words. Lower values make responses more focused; higher values introduce more variety and potential surprises. |
| top_p | number | Adjusts the creativity of the AI's responses by controlling how many possible words it considers. Lower values make outputs more predictable; higher values allow for more varied and creative responses. |
Sourced from the model's published API schema.
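The optional parameters listed above go in the same JSON body as `messages`. The following is a minimal Python sketch of how such a request could be assembled, not an official client. The helper name `build_request`, the placeholder credentials, and the specific parameter values are illustrative assumptions. The endpoint URL and headers are taken from the curl example, and the temperature check reflects the 0 to 5 range stated above. The sketch uses the `messages` form, as the curl example does, and only builds the request so that it can be inspected without credentials.

```python
import json
import os
import urllib.request

# Endpoint pieces copied from the curl example on this page.
API_BASE = "https://api.cloudflare.com/client/v4/accounts"
MODEL = "@cf/qwen/qwq-32b"


def build_request(account_id, token, messages, **params):
    """Build (but do not send) a POST request for the model endpoint.

    `build_request` is a hypothetical helper for illustration only.
    """
    temperature = params.get("temperature")
    # The page lists the temperature range as 0 to 5.
    if temperature is not None and not 0 <= temperature <= 5:
        raise ValueError("temperature must be between 0 and 5")

    # Send only the options that were set explicitly.
    body = {"messages": messages}
    body.update({k: v for k, v in params.items() if v is not None})

    return urllib.request.Request(
        f"{API_BASE}/{account_id}/ai/run/{MODEL}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_request(
        os.environ.get("CLOUDFLARE_ACCOUNT_ID", "<account-id>"),
        os.environ.get("CLOUDFLARE_AUTH_TOKEN", "<token>"),
        [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Explain quantum entanglement in one sentence."},
        ],
        max_tokens=256,   # illustrative value
        temperature=0.6,  # illustrative value
        seed=42,          # illustrative value
    )
    # Inspect the request; pass it to urllib.request.urlopen() to send it.
    print(request.full_url)
    print(request.data.decode("utf-8"))
```

Keeping request construction separate from sending makes it easy to check the exact payload before spending tokens, and an out-of-range temperature fails locally instead of at the API.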