GPT-4o

OpenAI GPT-4o Generally Available May 2024

OpenAI's flagship omni-modal model. Fast, multimodal (text, vision, audio), 128K context, and strong across reasoning, coding, and instruction following.

Context: 128K tokens

Input: $2.50 per MTok

Output: $10.00 per MTok

About

GPT-4o ("o" for "omni") is OpenAI's latest AI model, supporting both text and image inputs with text outputs. It maintains the intelligence level of [GPT-4 Turbo](/models/openai/gpt-4-turbo) while being twice as...

Modalities

Input

Text Vision Audio Documents / PDFs Code

Output

Text Code Audio

Code Examples

curl https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      { "role": "user", "content": "Explain quantum entanglement in one sentence." }
    ]
  }'

import os

from openai import OpenAI

# Read the API key from the environment rather than hard-coding it.
client = OpenAI(
    base_url="https://api.openai.com/v1",
    api_key=os.environ["OPENAI_API_KEY"],
)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Explain quantum entanglement in one sentence."},
    ],
)
# Print the text of the first (and here only) completion choice.
print(response.choices[0].message.content)

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.openai.com/v1",
  apiKey: process.env.OPENAI_API_KEY,
});

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [
    { role: "user", content: "Explain quantum entanglement in one sentence." },
  ],
});
console.log(response.choices[0].message.content);

API Parameters

Name	Type	Default	Description
`frequency_penalty`	number	0	Penalize tokens by their frequency so far. Positive values reduce repetition.
`logit_bias`	object	—	Map of token-id to bias (-100…100) added to the logit before sampling.
`logprobs`	boolean	false	Return log probabilities for each output token.
`max_completion_tokens`	integer	—	Maximum number of tokens the model may generate in the response.
`max_tokens` deprecated	integer	—	Deprecated. Use max_completion_tokens.
`presence_penalty`	number	0	Penalize tokens that have appeared at all so far. Positive values encourage new topics.
`response_format`	one of	—	Constrain the output format: plain text, a JSON object, or a JSON schema (structured outputs).
`seed`	integer	—	Seed for best-effort deterministic sampling. Repeated requests with the same seed and parameters usually, but not always, return the same output.
`stop`	array	—	Up to 4 sequences where the API will stop generating tokens.
`structured_outputs`	boolean	—	Enable JSON-schema-constrained output.
`temperature`	number	1	Sampling temperature (0 to 2); higher values produce more random output, and values near 0 make output close to deterministic.
`tool_choice`	one of	—	Controls which (if any) tool is called: "none", "auto", "required", or a specific tool.
`tools`	array	—	List of tools (functions) the model may call.
`top_logprobs`	integer	—	Return the top-N most likely tokens at each step (requires logprobs: true).
`top_p`	number	1	Nucleus sampling: sample only from the smallest set of most likely tokens whose cumulative probability reaches top_p.
`web_search_options`	object	—	Configuration for web-search tool augmentation.

Standard OpenAI-compatible parameters. Consult the provider docs for model-specific behaviour.
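
Several rows in the table carry constraints that are easy to check before a request is sent. The sketch below is one possible client-side check, not part of any SDK: `build_chat_request` is a hypothetical helper, and the rules it enforces are only those stated in the table (the `logit_bias` range, the 4-sequence limit on `stop`, `top_logprobs` requiring `logprobs`, and the `max_tokens` deprecation).

```python
from typing import Any


def build_chat_request(prompt: str, **params: Any) -> dict[str, Any]:
    """Build a chat-completions payload, checking constraints from the table.

    Hypothetical helper for illustration; the API performs its own validation.
    """
    # logit_bias values must fall within -100..100.
    for token_id, bias in (params.get("logit_bias") or {}).items():
        if not -100 <= bias <= 100:
            raise ValueError(f"logit_bias for token {token_id} must be in -100..100")

    # stop accepts up to 4 sequences.
    stop = params.get("stop")
    if isinstance(stop, list) and len(stop) > 4:
        raise ValueError("stop accepts at most 4 sequences")

    # top_logprobs is only meaningful when logprobs is enabled.
    if params.get("top_logprobs") is not None and not params.get("logprobs"):
        raise ValueError("top_logprobs requires logprobs=True")

    # max_tokens is deprecated: forward its value under the current name,
    # unless max_completion_tokens was already given explicitly.
    if "max_tokens" in params:
        params.setdefault("max_completion_tokens", params.pop("max_tokens"))

    return {
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": prompt}],
        **params,
    }
```

The returned dictionary can be passed as keyword arguments to `client.chat.completions.create(**request)` or serialized as the JSON body of the curl request shown earlier.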

Benchmark Scores

Benchmark	Score	Methodology
MMLU	88.7%	5-shot
HumanEval	90.2%	0-shot
MATH	76.6%	4-shot
GPQA Diamond	53.6%	0-shot

Performance

Output speed: 110 tok / sec

Source: Artificial Analysis, April 2026
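
As a rough illustration of what this throughput means in practice, the sketch below converts an output length into an approximate streaming time. It assumes the 110 tok/sec figure holds steady and ignores time to first token and network latency, so treat the result as a lower bound; `estimate_generation_seconds` is a hypothetical helper name.

```python
OUTPUT_TOKENS_PER_SECOND = 110.0  # figure quoted above; real throughput varies


def estimate_generation_seconds(output_tokens: int) -> float:
    """Approximate time to generate `output_tokens` at the quoted speed."""
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    return output_tokens / OUTPUT_TOKENS_PER_SECOND
```

At this rate, a 550-token response would take about 5 seconds to generate.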

Strengths & Limitations

Best For

Multimodal (text, vision, audio)

Strong instruction following

Broad benchmark coverage

Fast API response times

Limitations

128K context smaller than some competitors

Proprietary — no local deployment

Premium pricing for high-volume use
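
To gauge whether the pricing matters for a given workload, the listed rates ($2.50 per MTok input, $10.00 per MTok output) can be turned into a per-request estimate. This is a minimal sketch that assumes MTok means one million tokens and that no discounts (such as cached-input or batch pricing) apply; `estimate_cost_usd` is a hypothetical helper.

```python
from decimal import Decimal

# Rates as listed on this page, in USD per million tokens.
INPUT_USD_PER_MTOK = Decimal("2.50")
OUTPUT_USD_PER_MTOK = Decimal("10.00")
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost of one request from its token counts."""
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MTOK / TOKENS_PER_MTOK
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MTOK / TOKENS_PER_MTOK
    return input_cost + output_cost
```

For example, a request with 2,000 input tokens and 500 output tokens comes to about $0.01, so one million such requests would cost roughly $10,000.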