Mistral Small 3.2

Latest | Mistral | Mistral Small | Generally Available | Jun 2025

Mistral's refined 24B model with a 128K context window (128 × 1,024 = 131,072 tokens, shown as 131K below), vision support, and an Apache 2.0 licence. It beats Mixtral 8x7B on most benchmarks at much lower cost and has more than 1M Hugging Face downloads.

Context: 131K tokens
Input: $0.10 per MTok
Output: $0.30 per MTok
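
As a rough illustration of the pricing above, the sketch below estimates the cost of a single request. The two rates are copied from this listing; the function name `estimate_cost_usd` is hypothetical, and the token counts would in practice come from the usage figures the provider reports, so treat the result as an estimate rather than a bill.

```python
# Rates copied from the listing above (USD per million tokens).
INPUT_USD_PER_MTOK = 0.10
OUTPUT_USD_PER_MTOK = 0.30


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD of one request at the listed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Example: a 20,000-token prompt with a 1,000-token completion.
print(f"${estimate_cost_usd(20_000, 1_000):.4f}")
```

Because output tokens cost three times as much as input tokens here, long completions tend to dominate the cost more than long prompts do.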

Modalities

Input: Text, Vision, Code
Output: Text, Code

Code Examples

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistralai/mistral-small-3.2-24b-instruct",
    "messages": [
      { "role": "user", "content": "Explain quantum entanglement in one sentence." }
    ]
  }'
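
The same request can be sketched in Python with only the standard library. The endpoint and headers mirror the curl call above. The model slug, the helper names `build_request` and `send_request`, and the optional image argument are assumptions made for illustration; the image case uses the OpenAI-style `image_url` content part, which should be checked against the provider docs before relying on it.

```python
import json
import urllib.request
from typing import Optional

API_URL = "https://openrouter.ai/api/v1/chat/completions"  # same endpoint as the curl call
# Assumed model slug; confirm it against the provider's model list.
MODEL = "mistralai/mistral-small-3.2-24b-instruct"


def build_request(prompt: str, api_key: str,
                  image_url: Optional[str] = None) -> urllib.request.Request:
    """Build, but do not send, a chat completion request."""
    if image_url is None:
        content = prompt
    else:
        # OpenAI-style multimodal content: one text part plus one image part.
        content = [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": image_url}},
        ]
    body = {"model": MODEL, "messages": [{"role": "user", "content": content}]}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_request(req: urllib.request.Request) -> str:
    """Send the request and return the text of the first choice."""
    with urllib.request.urlopen(req, timeout=60) as resp:
        reply = json.load(resp)
    return reply["choices"][0]["message"]["content"]
```

A call such as `send_request(build_request("Explain quantum entanglement in one sentence.", os.environ["OPENROUTER_API_KEY"]))` (after `import os`) would correspond to the curl example. Keeping the build step separate from the send step makes the payload easy to inspect without a network call.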

API Parameters

Name | Type | Description
frequency_penalty | number | Penalize tokens in proportion to how often they have already appeared. Positive values reduce repetition.
max_completion_tokens | integer | Maximum number of tokens the model may generate in the response.
presence_penalty | number | Penalize tokens that have already appeared at least once. Positive values encourage new topics.
response_format | one of | Constrain output to a JSON schema or an enum (structured outputs).
seed | integer | Seed for sampling. Repeating a request with the same seed, prompt, and parameters should return the same output, but determinism is best effort and not guaranteed.
stop | array | Up to 4 sequences at which the API stops generating tokens.
stream | boolean | Stream partial responses as Server-Sent Events.
temperature | number | Sampling temperature; higher values produce more random output. Values near 0 make output close to deterministic.
tool_choice | one of | Controls which tool, if any, is called: "none", "auto", "required", or a specific tool.
tools | array | List of tools (functions) the model may call.
top_p | number | Nucleus sampling: sample only from the smallest set of most likely tokens whose cumulative probability reaches top_p.
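
To make the temperature and top_p rows more concrete, here is a toy sketch of both techniques on a four-token distribution. It illustrates the general idea only and is not the provider's sampler; the function names and the example logits are made up for illustration.

```python
import math


def apply_temperature(logits, temperature):
    """Softmax over logits / temperature; lower values sharpen the distribution."""
    if temperature <= 0:
        # Real APIs usually treat 0 as greedy decoding; this toy version rejects it.
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def nucleus(probs, top_p):
    """Indices of the smallest set of most likely tokens whose
    cumulative probability reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept


probs = [0.5, 0.3, 0.15, 0.05]
print(nucleus(probs, 0.9))  # keeps the three most likely tokens: [0, 1, 2]
```

A sampler would then renormalize the kept probabilities and draw from them, so a lower top_p removes more of the unlikely tail.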

Standard OpenAI-compatible parameters. Consult the provider docs for model-specific behaviour.
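
The sketch below shows one way to assemble a request body from the parameters listed above. The helper `build_payload`, the `get_weather` tool, and the model slug are hypothetical and used only for illustration; the tool follows the OpenAI function-calling shape, which this model's providers may or may not fully support.

```python
from typing import Any

# Parameter names copied from the table above.
SUPPORTED_PARAMS = frozenset({
    "frequency_penalty", "max_completion_tokens", "presence_penalty",
    "response_format", "seed", "stop", "stream", "temperature",
    "tool_choice", "tools", "top_p",
})
MAX_STOP_SEQUENCES = 4  # limit stated in the table


def build_payload(model: str, messages: list, **params: Any) -> dict:
    """Assemble a chat completion body, rejecting names not in the table."""
    unknown = sorted(set(params) - SUPPORTED_PARAMS)
    if unknown:
        raise ValueError(f"unsupported parameters: {unknown}")
    stop = params.get("stop")
    if stop is not None and len(stop) > MAX_STOP_SEQUENCES:
        raise ValueError(f"stop accepts at most {MAX_STOP_SEQUENCES} sequences")
    return {"model": model, "messages": messages, **params}


# Hypothetical tool definition in the OpenAI function-calling shape.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

payload = build_payload(
    "mistralai/mistral-small-3.2-24b-instruct",  # assumed slug; verify with the provider
    [{"role": "user", "content": "What is the weather in Paris?"}],
    temperature=0.2,
    top_p=0.9,
    seed=42,
    max_completion_tokens=256,
    stop=["\n\n"],
    tools=[get_weather_tool],
    tool_choice="auto",
)
```

Validating parameter names locally catches typos before a request is sent, though the provider remains the authority on which parameters it actually honours.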

Benchmark Scores

Benchmark Score
MMLU 82%
HumanEval 88.99%
MATH 70%

Performance

Output speed: 120 tok/sec

Source: Mistral AI docs + llm-stats.com, April 2026
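
For a sense of what the listed output speed means in practice, this small sketch converts a completion length into an approximate generation time. The 120 tok/sec figure is taken from above; the function name and the `time_to_first_token_s` parameter are hypothetical additions, and real latency varies with provider, load, and prompt length.

```python
OUTPUT_TOKENS_PER_SEC = 120  # output speed listed above


def estimate_generation_seconds(output_tokens: int,
                                time_to_first_token_s: float = 0.0) -> float:
    """Approximate wall-clock time to stream a completion of the given length."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return time_to_first_token_s + output_tokens / OUTPUT_TOKENS_PER_SEC


# Example: a 600-token answer with no startup delay.
print(estimate_generation_seconds(600))  # 5.0
```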

Strengths & Limitations

Best For
Vision capable (Mistral Small 3.1+)
128K context
Apache 2.0 — fully open
Multilingual (24 languages)
Excellent cost efficiency
Limitations
24B — smaller than frontier models
No extended thinking

Tags

Open Weights, Vision, Fast, Efficient, Multilingual