Claude Sonnet 4
Featured, Latest, Anthropic, Claude 4, Generally Available, May 2025
Anthropic's mid-tier Claude 4 model, positioned as a balance of intelligence and speed. 200K-token context window, strong coding and reasoning, with optional extended thinking and prompt caching.
Context: 200K tokens
Input price: $3.00 per MTok
Output price: $15.00 per MTok
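As a rough illustration of how the per-MTok prices above translate into the cost of a single request, here is a minimal Python sketch. The function name `estimate_cost_usd` is hypothetical, and the calculation assumes plain list pricing with no prompt-caching discounts or other adjustments.

```python
from decimal import Decimal

# List prices quoted on this page, in USD per million tokens (MTok).
INPUT_USD_PER_MTOK = Decimal("3.00")
OUTPUT_USD_PER_MTOK = Decimal("15.00")
TOKENS_PER_MTOK = Decimal(1_000_000)


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the list-price cost of one request in USD.

    Hypothetical helper: ignores prompt caching and any other
    pricing adjustments a provider may apply.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = Decimal(input_tokens) * INPUT_USD_PER_MTOK / TOKENS_PER_MTOK
    output_cost = Decimal(output_tokens) * OUTPUT_USD_PER_MTOK / TOKENS_PER_MTOK
    return input_cost + output_cost
```

For example, a request with 10,000 input tokens and 2,000 output tokens comes to $0.03 + $0.03 = $0.06 at these rates.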
About
Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonnet 3.7, excelling in both coding and reasoning tasks with improved precision and controllability. Achieving state-of-the-art performance on SWE-bench (72.7%),...
Modalities
Input: Text, Vision, Documents / PDFs, Code
Output: Text, Code
Code Examples
curl https://api.anthropic.com/v1/messages \
  -H "x-api-key: $ANTHROPIC_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "max_tokens": 1024,
    "messages": [
      { "role": "user", "content": "Explain quantum entanglement in one sentence." }
    ]
  }'
API Parameters
| Name | Type | Description |
|---|---|---|
| include_reasoning | boolean | Include the model's internal reasoning trace in the response. |
| max_tokens (deprecated) | integer | Deprecated. Use max_completion_tokens. |
| reasoning | object | Configuration for extended-thinking / reasoning mode. |
| stop | array | Up to 4 sequences where the API will stop generating tokens. |
| temperature | number | Sampling temperature; higher values produce more random output, and values near 0 make output close to deterministic. |
| tool_choice | string or object | Controls which (if any) tool is called: "none", "auto", "required", or a specific tool. |
| tools | array | List of tools (functions) the model may call. |
| top_k | integer | Limit sampling to the top-k most likely tokens at each step. |
| top_p | number | Nucleus sampling: sample only from the smallest set of most likely tokens whose cumulative probability reaches top_p. |
These are the standard OpenAI-compatible parameters, so some names differ from Anthropic's native Messages API used in the curl example above, where max_tokens is a required field. Consult the provider docs for model-specific behaviour.
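To make the constraints in the table concrete, the following Python sketch assembles a parameter dictionary and checks the two limits the table states: at most 4 stop sequences, and a `tool_choice` string drawn from "none", "auto", or "required". The helper name `build_params` is hypothetical, and tool definitions are passed through unchanged because their schema is not specified on this page.

```python
ALLOWED_TOOL_CHOICES = {"none", "auto", "required"}
MAX_STOP_SEQUENCES = 4  # "Up to 4 sequences" per the table


def build_params(temperature=None, top_p=None, top_k=None,
                 stop=None, tools=None, tool_choice=None):
    """Collect optional sampling and tool parameters into one dict.

    Hypothetical helper: only the limits stated in the table are
    checked; everything else is left to the provider to validate.
    """
    params = {}
    if stop is not None:
        if len(stop) > MAX_STOP_SEQUENCES:
            raise ValueError(f"at most {MAX_STOP_SEQUENCES} stop sequences allowed")
        params["stop"] = list(stop)
    if tool_choice is not None:
        # A non-string value stands for "a specific tool"; its exact
        # shape is provider-defined, so it is not inspected here.
        if isinstance(tool_choice, str) and tool_choice not in ALLOWED_TOOL_CHOICES:
            raise ValueError(f"unknown tool_choice: {tool_choice!r}")
        params["tool_choice"] = tool_choice
    if tools is not None:
        params["tools"] = list(tools)
    # Sampling controls are included only when explicitly set.
    for name, value in (("temperature", temperature),
                        ("top_p", top_p),
                        ("top_k", top_k)):
        if value is not None:
            params[name] = value
    return params
```

Leaving unset parameters out of the dictionary, rather than sending explicit nulls, lets the provider apply its own defaults.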
Benchmark Scores
| Benchmark | Score |
|---|---|
| MMLU | 86.5% |
| GPQA Diamond | 75.4% |
| SWE-bench Verified | 72.7% |
| MATH-500 | 78.2% |
Performance
Output speed: 80 tok / sec
Source: Anthropic docs + llm-stats.com, April 2026
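The output-speed figure can be turned into a back-of-the-envelope generation time. The Python sketch below is an illustration only: the function name `estimate_generation_seconds` is hypothetical, it assumes thinking tokens are produced at the same rate as visible output, and it ignores time to first token and network overhead.

```python
OUTPUT_TOKENS_PER_SECOND = 80.0  # output speed quoted on this page


def estimate_generation_seconds(output_tokens, thinking_tokens=0,
                                tokens_per_second=OUTPUT_TOKENS_PER_SECOND):
    """Rough streaming time for a response, in seconds.

    Assumption: thinking tokens stream at the same speed as output
    tokens. Time to first token and network latency are not modelled.
    """
    if output_tokens < 0 or thinking_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return (output_tokens + thinking_tokens) / tokens_per_second
```

Under these assumptions an 800-token answer takes about 10 seconds, and adding 1,600 thinking tokens raises that to about 30 seconds.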
Strengths & Limitations
Best For
200K context window
Extended thinking mode
Excellent coding (SWE-bench)
Prompt caching for long contexts
Computer use capability
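Extended thinking and prompt caching are enabled through fields in the request body. The Python sketch below builds a body for the native Messages API endpoint used in the curl example. The helper name `build_message_body` is hypothetical. The `thinking` and `cache_control` field shapes, the 1,024-token minimum budget, and the rule that the budget must stay below `max_tokens` reflect my understanding of Anthropic's documentation and should be checked against the current docs.

```python
MODEL_ID = "claude-sonnet-4-20250514"  # same model ID as the curl example
MIN_THINKING_BUDGET = 1024  # assumed minimum thinking budget, in tokens


def build_message_body(system_text, user_text, max_tokens=4096,
                       thinking_budget=None, cache_system=False):
    """Build a Messages API request body as a plain dict (hypothetical helper)."""
    system_block = {"type": "text", "text": system_text}
    if cache_system:
        # Marks the system prompt as a cacheable prefix.
        system_block["cache_control"] = {"type": "ephemeral"}
    body = {
        "model": MODEL_ID,
        "max_tokens": max_tokens,
        "system": [system_block],
        "messages": [{"role": "user", "content": user_text}],
    }
    if thinking_budget is not None:
        if thinking_budget < MIN_THINKING_BUDGET:
            raise ValueError(f"thinking_budget must be at least {MIN_THINKING_BUDGET}")
        if thinking_budget >= max_tokens:
            # Assumed rule: the thinking budget must leave room for the answer.
            raise ValueError("thinking_budget must be smaller than max_tokens")
        body["thinking"] = {"type": "enabled", "budget_tokens": thinking_budget}
    return body
```

The resulting dictionary can be serialized with `json.dumps` and sent as the `-d` payload of the curl request shown earlier.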
Limitations
Premium output pricing
Thinking mode adds latency
No audio modality
Tags
Coding, Reasoning, Long Context, Tool Use, Thinking