Gemini 2.5 Pro


Google Gemini 2.5 Generally Available May 2025

Google's most capable model — 1M token context, state-of-the-art reasoning with hybrid thinking, and top scores on AIME, GPQA, and long-context retrieval benchmarks.

Context: 1M tokens

Input: $1.25 per MTok
Output: $10.00 per MTok
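
To make the listed rates concrete, the following is a minimal sketch of a per-request cost estimate. It assumes the two flat per-MTok prices shown above apply to every token and ignores any tiered pricing, discounts, or free-tier allowances the provider may offer. The function name `estimate_cost_usd` is illustrative and not part of any SDK.

```python
# Prices copied from the listing above (USD per million tokens).
INPUT_USD_PER_MTOK = 1.25
OUTPUT_USD_PER_MTOK = 10.00


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request at the flat listed rates."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MTOK
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MTOK
    # Round to micro-dollars to hide floating-point noise.
    return round(input_cost + output_cost, 6)


if __name__ == "__main__":
    # Hypothetical request: 200,000 prompt tokens, 4,000 completion tokens.
    print(estimate_cost_usd(200_000, 4_000))
```

At these rates output tokens cost eight times as much as input tokens, so long completions are likely to dominate the bill even when the prompt is large.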


About

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy...

Modalities

Input: Text, Vision, Audio, Video, Documents / PDFs, Code

Output: Text, Code
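
As an illustration of non-text input, here is a minimal sketch that builds a user message with one text part and one inline image. It follows the content-part layout of the OpenAI chat completions format (`text` and `image_url` parts). Whether a given gateway accepts this layout for this model is an assumption to verify against the provider docs. The helper name `image_message` is hypothetical.

```python
import base64


def image_message(prompt: str, image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Build one user message holding a text part and an inline image part."""
    # Encode the raw bytes as a base64 data URL so no file hosting is needed.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder bytes; a real call would read an actual image file.
    message = image_message("Describe this image.", b"\x89PNG placeholder")
    print(message["content"][1]["image_url"]["url"][:40])
```

The returned dictionary can be placed in the `messages` list of a chat completions request in place of a plain-text message.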

Code Examples

curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "google/gemini-2.5-pro",
    "messages": [
      { "role": "user", "content": "Explain quantum entanglement in one sentence." }
    ]
  }'

import os

from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)
response = client.chat.completions.create(
    model="google/gemini-2.5-pro",
    messages=[
        {"role": "user", "content": "Explain quantum entanglement in one sentence."},
    ],
)
print(response.choices[0].message.content)

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://openrouter.ai/api/v1",
  apiKey: process.env.OPENROUTER_API_KEY,
});

const response = await client.chat.completions.create({
  model: "google/gemini-2.5-pro",
  messages: [
    { role: "user", content: "Explain quantum entanglement in one sentence." },
  ],
});
console.log(response.choices[0].message.content);

API Parameters

Name	Type	Default	Description
`include_reasoning`	boolean	—	Include the model's internal reasoning trace in the response.
`max_tokens` (deprecated)	integer	—	Deprecated. Use `max_completion_tokens`.
`reasoning`	object	—	Configuration for extended-thinking / reasoning mode.
`response_format`	one of	—	Constrain output to a JSON schema or an enum (structured outputs).
`seed`	integer	—	Seed for sampling. Requests with the same seed and parameters aim to return the same output, but determinism is best-effort and not guaranteed.
`stop`	array	—	Up to 4 sequences where the API will stop generating tokens.
`structured_outputs`	boolean	—	Enable JSON-schema-constrained output.
`temperature`	number	1	Sampling temperature; higher values produce more random output. 0 makes output close to deterministic.
`tool_choice`	one of	—	Controls which (if any) tool is called: "none", "auto", "required", or a specific tool.
`tools`	array	—	List of tools (functions) the model may call.
`top_p`	number	1	Nucleus sampling: sample only from the smallest set of most-likely tokens whose cumulative probability reaches top_p.

Standard OpenAI-compatible parameters. Consult the provider docs for model-specific behaviour.
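
The parameters in the table can be combined into one request body. The sketch below assembles such a body without sending it, so it runs offline. Assumptions: `build_chat_request` is a hypothetical helper, the schema name `"result"` is a placeholder, the `response_format` layout follows the OpenAI structured-outputs convention, and the `reasoning` object is passed through unchanged because its fields are provider-defined. The 4-sequence limit on `stop` comes from the table.

```python
import json
from typing import Any, Optional

MAX_STOP_SEQUENCES = 4  # limit stated in the parameter table


def build_chat_request(
    model: str,
    prompt: str,
    *,
    temperature: float = 1.0,
    top_p: float = 1.0,
    seed: Optional[int] = None,
    stop: Optional[list[str]] = None,
    reasoning: Optional[dict[str, Any]] = None,
    json_schema: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Assemble a chat-completions request body; unset optional fields are omitted."""
    if not 0.0 <= top_p <= 1.0:
        raise ValueError("top_p must be between 0 and 1")
    if stop is not None and len(stop) > MAX_STOP_SEQUENCES:
        raise ValueError(f"at most {MAX_STOP_SEQUENCES} stop sequences are allowed")

    body: dict[str, Any] = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "top_p": top_p,
    }
    if seed is not None:
        body["seed"] = seed
    if stop:
        body["stop"] = list(stop)
    if reasoning is not None:
        # Passed through as-is; the accepted fields depend on the provider.
        body["reasoning"] = reasoning
    if json_schema is not None:
        # OpenAI-style structured output; "result" is a placeholder name.
        body["response_format"] = {
            "type": "json_schema",
            "json_schema": {"name": "result", "strict": True, "schema": json_schema},
        }
    return body


if __name__ == "__main__":
    body = build_chat_request(
        "google/gemini-2.5-pro",  # namespaced ID; check the gateway's model list
        "Explain quantum entanglement in one sentence.",
        temperature=0.2,
        seed=42,
        stop=["\n\n"],
    )
    print(json.dumps(body, indent=2))
```

Omitting unset fields leaves the provider's own defaults in effect, which avoids sending values a particular model may not support.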

Benchmark Scores

Benchmark	Score	Methodology
MMLU	88.6%	5-shot
GPQA Diamond	84%	0-shot
AIME 2024	92%	0-shot

Performance

tok / sec

output speed

Source: Google AI docs + Artificial Analysis, April 2026

Strengths & Limitations

Best For

1M token context window

Hybrid thinking mode

Multimodal (text/vision/audio/video)

Top reasoning benchmarks

Free API tier available

Limitations

Higher output price vs competitors

No prompt caching on Gemini API

Thinking mode increases latency