Claude Sonnet 4

Featured Latest

Anthropic · Claude 4 · Generally Available · May 2025

Anthropic's mid-tier model, balancing intelligence and speed. 200K-token context window, strong coding and reasoning, with optional extended thinking and prompt caching.

Context: 200K tokens
Input: $3.00 per MTok
Output: $15.00 per MTok
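
The per-MTok prices above translate into a per-request cost with simple arithmetic. The following is a minimal sketch; the function name `estimate_cost_usd` is illustrative, and it assumes the flat base rates listed here, ignoring any prompt-caching discounts or other pricing adjustments.

```python
# Base rates copied from the listing above (USD per million tokens).
INPUT_USD_PER_MTOK = 3.00
OUTPUT_USD_PER_MTOK = 15.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the listed base rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Example: a 10,000-token prompt that produces a 1,000-token reply.
cost = estimate_cost_usd(input_tokens=10_000, output_tokens=1_000)
print(f"${cost:.4f}")  # prints $0.0450
```

Because output tokens cost five times as much as input tokens, long replies tend to dominate the bill more than long prompts do.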


About

Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonnet 3.7, excelling in both coding and reasoning tasks with improved precision and controllability. Achieving state-of-the-art performance on SWE-bench (72.7%),...

Modalities

Input: Text, Vision, Documents / PDFs, Code
Output: Text, Code
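
As an illustration of the vision input modality, the sketch below builds a user message that pairs an image with a text question, using the base64 image content block from Anthropic's Messages API. The helper name `build_image_message` is hypothetical, and the sample bytes are a placeholder rather than a real image.

```python
import base64


def build_image_message(image_bytes: bytes, media_type: str, question: str) -> dict:
    """Build one user message holding an image block followed by a text block."""
    encoded = base64.standard_b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {
                "type": "image",
                "source": {"type": "base64", "media_type": media_type, "data": encoded},
            },
            {"type": "text", "text": question},
        ],
    }


# Placeholder bytes; in practice, read a real file, e.g. open("chart.png", "rb").read().
message = build_image_message(b"\x89PNG", "image/png", "What does this chart show?")
print(message["content"][0]["type"], message["content"][1]["type"])
```

The resulting dictionary can be passed as one element of the `messages` list in the `client.messages.create(...)` call shown under Code Examples.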

Code Examples

cURL

curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "messages": [
      { "role": "user", "content": "Explain quantum entanglement in one sentence." }
    ]
  }'

Python

from anthropic import Anthropic

# The client reads the API key from the ANTHROPIC_API_KEY environment variable.
client = Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Explain quantum entanglement in one sentence."},
    ],
)
print(response.content[0].text)

TypeScript

import Anthropic from "@anthropic-ai/sdk";

// The client reads the API key from the ANTHROPIC_API_KEY environment variable.
const client = new Anthropic();
const response = await client.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  messages: [
    { role: "user", content: "Explain quantum entanglement in one sentence." },
  ],
});
// Content blocks are a union type; narrow to a text block before reading .text.
const block = response.content[0];
if (block.type === "text") console.log(block.text);

API Parameters

Name	Type	Default	Description
`include_reasoning`	boolean	—	Include the model's internal reasoning trace in the response.
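`max_tokens`	integer	—	Deprecated in the OpenAI-compatible schema; use `max_completion_tokens` there. Anthropic's native Messages API (used in the code examples above) still requires `max_tokens`.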
`reasoning`	object	—	Configuration for extended-thinking / reasoning mode.
`stop`	array	—	Up to 4 sequences where the API will stop generating tokens.
`temperature`	number	1	Sampling temperature; higher values produce more random output. 0 is the most deterministic setting, though outputs may still vary slightly.
`tool_choice`	one of	—	Controls which (if any) tool is called: "none", "auto", "required", or a specific tool.
`tools`	array	—	List of tools (functions) the model may call.
`top_k`	integer	—	Limit sampling to the top-k most likely tokens at each step.
`top_p`	number	1	Nucleus sampling: sample only from the smallest set of most likely tokens whose cumulative probability reaches top_p.

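These are standard OpenAI-compatible parameter names. Anthropic's native Messages API names some of them differently (for example, `thinking` rather than `reasoning`, and `stop_sequences` rather than `stop`). Consult the provider docs for model-specific behaviour.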
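
To make the interaction of `temperature`, `top_k`, and `top_p` concrete, here is a toy sketch of how these settings could reshape a next-token distribution. It is an illustration only, not the provider's implementation: the function name `filtered_distribution` and the order in which the filters are applied (temperature, then top-k, then top-p) are assumptions.

```python
import math


def filtered_distribution(logits, temperature=1.0, top_k=None, top_p=1.0):
    """Return {token_index: probability} after temperature, top-k and top-p filtering."""
    if temperature <= 0:
        raise ValueError("this sketch requires temperature > 0")
    # Temperature rescales logits: values below 1 sharpen, values above 1 flatten.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    # Sort tokens from most to least likely.
    ranked = sorted(((w / total, i) for i, w in enumerate(weights)), reverse=True)
    # top_k: keep only the k most likely tokens.
    if top_k is not None:
        ranked = ranked[:top_k]
    # top_p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for p, i in ranked:
        kept.append((p, i))
        cumulative += p
        if cumulative >= top_p:
            break
    # Renormalise the surviving tokens so they sum to 1.
    norm = sum(p for p, _ in kept)
    return {i: p / norm for p, i in kept}


# Four tokens with probabilities 0.5, 0.3, 0.15 and 0.05.
logits = [math.log(p) for p in (0.5, 0.3, 0.15, 0.05)]
print(filtered_distribution(logits, top_p=0.75))  # keeps tokens 0 and 1 only
```

With `top_p=0.75`, the first token alone (0.5) falls short of the threshold, so the second is added (0.8 in total) and the remaining two are discarded before sampling.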

Benchmark Scores

Benchmark	Score	Methodology
MMLU	86.5%	5-shot
GPQA Diamond	75.4%	0-shot
SWE-bench Verified	72.7%	0-shot
MATH-500	78.2%	0-shot

Performance

Output speed (tok / sec): no value listed

Source: Anthropic docs + llm-stats.com, April 2026

Strengths & Limitations

Best For

200K context window
Extended thinking mode
Excellent coding (SWE-bench)
Prompt caching for long contexts
Computer use capability
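
Prompt caching for long contexts can be sketched as follows, assuming Anthropic's native Messages API and its `cache_control` marker on a system content block. The helper names `build_cached_request` and `send` are hypothetical, and the minimum cacheable prompt length and cache lifetime should be checked against the provider docs.

```python
def build_cached_request(reference_text: str, question: str) -> dict:
    """Build request arguments that mark a long reference document as cacheable."""
    return {
        "model": "claude-sonnet-4-20250514",
        "max_tokens": 1024,
        "system": [
            {"type": "text", "text": "Answer using only the reference document."},
            {
                "type": "text",
                "text": reference_text,
                # Marks the prompt prefix up to and including this block for caching.
                "cache_control": {"type": "ephemeral"},
            },
        ],
        "messages": [{"role": "user", "content": question}],
    }


def send(request: dict):
    """Send the request; needs the anthropic package and ANTHROPIC_API_KEY."""
    from anthropic import Anthropic  # imported lazily so the builder works offline

    response = Anthropic().messages.create(**request)
    usage = response.usage
    # The first call should report cache-creation tokens, later calls cache reads.
    print(usage.cache_creation_input_tokens, usage.cache_read_input_tokens)
    return response


request = build_cached_request("<long reference document>", "Summarise section 2.")
print(request["system"][-1]["cache_control"])
```

Repeated calls that share the same cached prefix and vary only the question are the case where caching is expected to pay off.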

Limitations

Premium output pricing
Thinking mode adds latency
No audio modality
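
Since thinking mode adds latency, one way to bound it is to cap the thinking budget per request. The sketch below assumes Anthropic's native Messages API, where extended thinking is enabled through a `thinking` object whose `budget_tokens` must be at least 1024 and smaller than `max_tokens`. The helper names are hypothetical, and `fake_blocks` is an offline stand-in for a real response.

```python
from types import SimpleNamespace


def build_thinking_request(prompt: str, budget_tokens: int = 2048,
                           max_tokens: int = 4096) -> dict:
    """Build request arguments with extended thinking capped at budget_tokens."""
    if budget_tokens < 1024:
        raise ValueError("budget_tokens must be at least 1024")
    if budget_tokens >= max_tokens:
        raise ValueError("budget_tokens must be smaller than max_tokens")
    return {
        "model": "claude-sonnet-4-20250514",
        "max_tokens": max_tokens,
        "thinking": {"type": "enabled", "budget_tokens": budget_tokens},
        "messages": [{"role": "user", "content": prompt}],
    }


def final_text(content_blocks) -> str:
    """Join the text blocks of a response, skipping thinking blocks."""
    return "".join(b.text for b in content_blocks if b.type == "text")


# Stand-in for response.content; a real call would be
# Anthropic().messages.create(**build_thinking_request("...")).content
fake_blocks = [
    SimpleNamespace(type="thinking", thinking="6 * 7 = 42"),
    SimpleNamespace(type="text", text="42"),
]
print(final_text(fake_blocks))  # prints 42
```

A smaller budget limits how long the model may reason before answering, trading some depth on hard problems for lower latency and fewer billed output tokens.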