Phi-4
Latest · Other · Phi · Generally Available · Dec 2024
Microsoft's 14B-parameter small language model focused on math and code. It outperforms many 70B models on MMLU (84.8%) and MATH (80.4%), ships under the MIT licence, and has 600K+ HuggingFace downloads.
Context: 16K tokens
Input price: not listed (per MTok)
Output price: not listed (per MTok)
Modalities
Input: Text, Code
Output: Text, Code
Code Examples
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "microsoft/phi-4",
    "messages": [
      { "role": "user", "content": "Explain quantum entanglement in one sentence." }
    ]
  }'
API Parameters
| Name | Type | Description |
|---|---|---|
| frequency_penalty | number | Penalize tokens in proportion to how often they have appeared so far. Positive values reduce repetition. |
| max_completion_tokens | integer | Maximum number of tokens the model may generate in the response. |
| presence_penalty | number | Penalize tokens that have appeared at all so far. Positive values encourage new topics. |
| response_format | one of | Constrain output to a JSON schema or an enum (structured outputs). |
| seed | integer | Seed for sampling. The same seed and prompt should produce the same output, but determinism is best-effort and depends on the provider. |
| stop | array | Up to 4 sequences at which the API stops generating tokens. |
| stream | boolean | Stream partial responses as Server-Sent Events. |
| temperature | number | Sampling temperature; higher values produce more random output. 0 gives near-deterministic (greedy) output. |
| tool_choice | one of | Controls which (if any) tool is called: "none", "auto", "required", or a specific tool. |
| tools | array | List of tools (functions) the model may call. |
| top_p | number | Nucleus sampling: sample only from the smallest set of most likely tokens whose cumulative probability reaches top_p. |
Standard OpenAI-compatible parameters. Consult the provider docs for model-specific behaviour.
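As a sketch of how several of the parameters above combine in one request, the shell functions below build a payload for the same endpoint and model as the curl example. The function names `build_payload` and `send_request` are hypothetical, and the parameter values are illustrative assumptions rather than provider recommendations. The prompt is interpolated without JSON escaping, so it must not contain double quotes, backslashes, or newlines.

```shell
# Sketch only: values below are illustrative, not tuned recommendations.

build_payload() {
  # $1 = user prompt (assumed free of double quotes, backslashes, and newlines,
  # because this sketch performs no JSON escaping)
  cat <<EOF
{
  "model": "microsoft/phi-4",
  "messages": [{ "role": "user", "content": "$1" }],
  "temperature": 0.2,
  "top_p": 0.9,
  "max_completion_tokens": 256,
  "seed": 42,
  "stop": ["\n\n"]
}
EOF
}

send_request() {
  # $1 = user prompt; requires OPENROUTER_API_KEY in the environment
  curl -sS https://openrouter.ai/api/v1/chat/completions \
    -H "Authorization: Bearer $OPENROUTER_API_KEY" \
    -H "Content-Type: application/json" \
    -d "$(build_payload "$1")"
}
```

Calling `send_request "Factor x^2 - 5x + 6."` would send the request. Whether a given provider honours every field (for example `seed`) may vary, so check the provider docs before relying on it.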
Benchmark Scores
| Benchmark | Score |
|---|---|
| MMLU | 84.8% |
| HumanEval | 82.6% |
| MATH | 80.4% |
| GPQA Diamond | 56.1% |
Performance
130 tok/sec output speed
Source: Phi-4 technical report (Microsoft Research) + HuggingFace, April 2026
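To put the output speed in concrete terms, a rough generation-time estimate is the number of output tokens divided by the throughput. The helper below is a minimal sketch; `estimate_seconds` is a hypothetical name, and the estimate assumes a steady 130 tok/sec while ignoring time to first token, network latency, and provider variation.

```shell
estimate_seconds() {
  # $1 = number of output tokens
  # $2 = tokens per second (optional; defaults to the 130 tok/sec figure above)
  LC_ALL=C awk -v n="$1" -v r="${2:-130}" 'BEGIN { printf "%.1f\n", n / r }'
}

estimate_seconds 1000   # 1000 / 130, prints 7.7
```

Under these assumptions, a 1,000-token completion would take roughly 8 seconds of pure generation time.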
Strengths & Limitations
Best For
Best-in-class MMLU for 14B models
MIT licence, which permits commercial use
Strong math (MATH 80.4%)
600K+ HuggingFace downloads
Runs on a single A100 GPU
Limitations
Context limited to 16K tokens
No vision or other multimodal input
Training data skewed toward English
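Because the context window is small, it can help to check whether a prompt plus the planned completion is likely to fit before sending it. The sketch below is a rough pre-flight check, not an exact count: it assumes that 16K means 16384 tokens and that English text averages about 4 characters per token. The names `estimate_tokens` and `fits_context` are hypothetical, and the model's own tokenizer should be used when an exact count matters.

```shell
# Assumptions: 16K = 16384 tokens; about 4 characters per token for English text.
CONTEXT_TOKENS=16384

estimate_tokens() {
  # $1 = text; prints ceil(length / 4) as a rough token estimate
  printf '%s\n' $(( (${#1} + 3) / 4 ))
}

fits_context() {
  # $1 = prompt text, $2 = tokens reserved for the completion
  # Succeeds (exit status 0) if the estimated total fits in the context window.
  prompt_tokens=$(estimate_tokens "$1")
  [ $(( prompt_tokens + $2 )) -le "$CONTEXT_TOKENS" ]
}
```

For example, `fits_context "$PROMPT" 512 || echo "prompt may be too long"` warns before a request that would probably be rejected or truncated.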
Tags
Open Weights, Math, Coding, Efficient, Small