Gemma 3 12b IT
About
Gemma 3 models are well-suited for a variety of text generation and image understanding tasks, including question answering, summarization, and reasoning. The models are multimodal: they accept text and image input and generate text output. They offer a large 128K context window, multilingual support for over 140 languages, and more model sizes than previous versions.
Code Examples
curl https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run/@cf/google/gemma-3-12b-it \
-H "Authorization: Bearer $CLOUDFLARE_AUTH_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"messages": [
{ "role": "system", "content": "You are a helpful assistant." },
{ "role": "user", "content": "Explain quantum entanglement in one sentence." }
]
}'
API Parameters
Temperature: 0 – 5
| Name | Type | Description |
|---|---|---|
| messages (required) | array | An array of message objects representing the conversation history. |
| prompt (required) | string | The input text prompt for the model to generate a response. |
| frequency_penalty | number | Decreases the likelihood of the model repeating the same lines verbatim. |
| functions (deprecated) | array | Deprecated. Use tools instead. |
| guided_json | object | JSON schema that the response should fulfill. |
| max_tokens | integer | The maximum number of tokens to generate in the response. |
| presence_penalty | number | Increases the likelihood of the model introducing new topics. |
| raw | boolean | If true, no chat template is applied and you must follow the model's expected prompt format yourself. |
| repetition_penalty | number | Penalty for repeated tokens; higher values discourage repetition. |
| seed | integer | Random seed for reproducibility of the generation. |
| stream | boolean | If true, the response is streamed back incrementally using Server-Sent Events (SSE). |
| temperature | number | Controls the randomness of the output; higher values produce more random results. |
| tools | array | A list of tools available for the assistant to use. |
| top_k | integer | Limits the model to choosing from the top k most probable tokens. Lower values make responses more focused; higher values introduce more variety and potential surprises. |
| top_p | number | Controls how much of the probability mass the model samples from. Lower values make outputs more predictable; higher values allow more varied and creative responses. |
Sourced from the model's published API schema.
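The optional parameters can be sent in the same JSON body as messages. The following is a minimal sketch, not an official client: the helper names gemma_payload and gemma_run are hypothetical, the default values (256 tokens, temperature 0.7) are illustrative assumptions, and the message text is assumed to contain no double quotes, backslashes, or newlines because no JSON escaping is performed.

```shell
# Hypothetical helper: prints a JSON request body for one user message.
# Usage: gemma_payload MESSAGE [MAX_TOKENS] [TEMPERATURE]
# Assumption: MESSAGE needs no JSON escaping (no quotes, backslashes, newlines).
gemma_payload() {
  user_message=$1
  max_tokens=${2:-256}      # illustrative default, not taken from the schema
  temperature=${3:-0.7}     # illustrative default, not taken from the schema

  # This page lists a temperature range of 0 to 5; reject values outside it.
  if ! awk -v t="$temperature" 'BEGIN { exit !(t >= 0 && t <= 5) }'; then
    echo "temperature must be between 0 and 5" >&2
    return 1
  fi

  cat <<EOF
{
  "messages": [
    { "role": "user", "content": "$user_message" }
  ],
  "max_tokens": $max_tokens,
  "temperature": $temperature
}
EOF
}

# Hypothetical helper: sends the body to the endpoint used in the curl example.
# Requires CLOUDFLARE_ACCOUNT_ID and CLOUDFLARE_AUTH_TOKEN in the environment.
gemma_run() {
  body=$(gemma_payload "$@") || return 1
  curl -s "https://api.cloudflare.com/client/v4/accounts/$CLOUDFLARE_ACCOUNT_ID/ai/run/@cf/google/gemma-3-12b-it" \
    -H "Authorization: Bearer $CLOUDFLARE_AUTH_TOKEN" \
    -H "Content-Type: application/json" \
    -d "$body"
}
```

For example, gemma_run "Explain quantum entanglement in one sentence." 128 0.2 would request a short, low-randomness answer. Building the body in a separate function keeps the range check and the network call apart, so the payload can be inspected without sending anything.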