GPT-4o mini

OpenAI · GPT-4o · Generally Available · Jul 2024

OpenAI's cost-efficient small model. It delivers strong multimodal performance at a fraction of GPT-4o's cost, which makes it well suited to high-volume, latency-sensitive applications.

Context: 128K tokens
Input: $0.15 per MTok (million tokens)
Output: $0.60 per MTok (million tokens)
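
The per-token prices above translate into a per-request cost with simple arithmetic. The following is a minimal sketch, assuming only the two list prices quoted here; the helper name `estimate_cost_usd` and the example workload are hypothetical, and real bills may differ (for example through discounts or pricing tiers that this page does not describe).

```python
# Prices quoted above, in USD per million tokens (MTok).
INPUT_USD_PER_MTOK = 0.15
OUTPUT_USD_PER_MTOK = 0.60


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts (hypothetical helper)."""
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MTOK
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MTOK
    return input_cost + output_cost


if __name__ == "__main__":
    # Made-up workload: 10,000 requests, each with 2,000 input tokens
    # and 500 output tokens.
    per_request = estimate_cost_usd(2_000, 500)
    print(f"per request: ${per_request:.6f}")
    print(f"10,000 requests: ${per_request * 10_000:.2f}")
```

With these assumed numbers, each request costs about $0.0006, so 10,000 such requests come to roughly $6.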

About

GPT-4o mini is OpenAI's newest model after [GPT-4 Omni](/models/openai/gpt-4o), supporting both text and image inputs with text outputs. As their most advanced small model, it is many multiples more affordable...

Modalities

Input: Text, Vision, Documents / PDFs, Code
Output: Text, Code

Code Examples

curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o-mini",
    "messages": [
      { "role": "user", "content": "Explain quantum entanglement in one sentence." }
    ]
  }'
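
For readers who prefer Python, the same request can be sketched with the standard library alone. This is an illustrative equivalent of the curl call above, not an official client: the helper names `build_request` and `ask` are hypothetical, and the response parsing assumes the usual Chat Completions shape (`choices[0].message.content`).

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) the same POST request as the curl example."""
    payload = {
        "model": "gpt-4o-mini",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def ask(prompt: str) -> str:
    """Send the request and return the first choice's text (needs network access)."""
    request = build_request(prompt, os.environ["OPENAI_API_KEY"])
    with urllib.request.urlopen(request, timeout=30) as response:
        body = json.load(response)
    # Assumed response shape: the reply text of the first choice.
    return body["choices"][0]["message"]["content"]


# Only call the API when run as a script with a key configured.
if __name__ == "__main__" and os.environ.get("OPENAI_API_KEY"):
    print(ask("Explain quantum entanglement in one sentence."))
```

Building the request separately from sending it keeps the payload easy to inspect or log before any network call is made.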

API Parameters

| Name | Type | Description |
|---|---|---|
| frequency_penalty | number | Penalizes tokens in proportion to how often they have appeared so far. Positive values reduce repetition. |
| logit_bias | object | Map of token ID to a bias (-100 to 100) added to that token's logit before sampling. |
| logprobs | boolean | Return log probabilities for each output token. |
| max_completion_tokens | integer | Maximum number of tokens the model may generate in the response. |
| max_tokens (deprecated) | integer | Deprecated. Use max_completion_tokens. |
| presence_penalty | number | Penalizes tokens that have appeared at least once so far. Positive values encourage new topics. |
| response_format | one of | Constrain the output format, for example to a JSON object or to a supplied JSON schema (structured outputs). |
| seed | integer | Seed for sampling. Determinism is best-effort: the same seed, prompt, and parameters usually, but not always, produce the same output. |
| stop | array | Up to 4 sequences at which the API stops generating tokens. |
| structured_outputs | boolean | Enable JSON-schema-constrained output. |
| temperature | number | Sampling temperature; higher values produce more random output. 0 is the most deterministic setting, though outputs can still vary slightly. |
| tool_choice | one of | Controls which (if any) tool is called: "none", "auto", "required", or a specific tool. |
| tools | array | List of tools (functions) the model may call. |
| top_logprobs | integer | Return the N most likely tokens at each position (requires logprobs: true). |
| top_p | number | Nucleus sampling: sample only from the smallest set of most likely tokens whose cumulative probability reaches top_p. |
| web_search_options | object | Configuration for web-search tool augmentation. |

Standard OpenAI-compatible parameters. Consult the provider docs for model-specific behaviour.
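
Of the parameters above, top_p is the one most often misread, so a small worked example may help. The sketch below illustrates the general nucleus-sampling idea on a toy distribution; it is not OpenAI's implementation, and the function name `nucleus` and the example probabilities are made up for illustration.

```python
def nucleus(probs: dict[str, float], top_p: float) -> dict[str, float]:
    """Keep the smallest set of most likely tokens whose cumulative
    probability reaches top_p, then renormalise the kept probabilities."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept = []
    cumulative = 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break  # the nucleus is complete; drop the remaining tail
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


if __name__ == "__main__":
    # Toy next-token distribution (made-up numbers).
    toy = {"the": 0.5, "a": 0.3, "an": 0.15, "this": 0.05}
    print(sorted(nucleus(toy, 0.9)))  # tokens that survive top_p = 0.9
    print(sorted(nucleus(toy, 0.5)))  # tokens that survive top_p = 0.5
```

Lower top_p values cut off more of the low-probability tail, which narrows the set of tokens the sampler can choose from.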

Benchmark Scores

| Benchmark | Score |
|---|---|
| MMLU | 82% |
| HumanEval | 87.2% |
| MATH | 70.2% |

Performance

Output speed: 150 tok / sec

Source: OpenAI docs + Artificial Analysis, April 2026
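
As a rough guide, the output-speed figure can be turned into an expected generation time. This is a back-of-the-envelope sketch, assuming the quoted throughput holds steady and ignoring time to first token, network latency, and load; the helper name `estimate_generation_seconds` is hypothetical.

```python
OUTPUT_TOKENS_PER_SECOND = 150  # throughput figure quoted above


def estimate_generation_seconds(
    output_tokens: int,
    tokens_per_second: float = OUTPUT_TOKENS_PER_SECOND,
) -> float:
    """Estimate streaming time for a response of the given length."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


if __name__ == "__main__":
    for n in (150, 500, 1_500):
        print(f"{n} output tokens: ~{estimate_generation_seconds(n):.1f} s")
```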

Strengths & Limitations

Best For
- Extremely low cost
- Fast response times
- Vision capable
- Great for high-volume workloads

Limitations
- Weaker reasoning than GPT-4o
- Smaller output token limit
- No audio output

Tags

Fast, Cheap, Vision, Multimodal, Tool Use