google/gemini-2.5-flash

Aliases: gemini-flash, 2.5-flash, flash

Input / 1M tokens: $0.30
Output / 1M tokens: $2.50
Context: 1.0M tokens
Speed: 250 t/s
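The list prices and speed above translate into per-request estimates with simple arithmetic. The sketch below is illustrative only: the helper names are hypothetical, it assumes the prices are USD per one million tokens and that the listed speed is a steady output rate, and it ignores time to first token and any provider-side billing details.

```python
# Figures copied from the card above: USD per 1M tokens and output tokens per second.
INPUT_USD_PER_M = 0.30
OUTPUT_USD_PER_M = 2.50
TOKENS_PER_SECOND = 250


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Hypothetical helper: list-price cost of one request in USD."""
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


def estimate_generation_seconds(output_tokens: int) -> float:
    """Hypothetical helper: time to stream the output at the listed speed."""
    return output_tokens / TOKENS_PER_SECOND


if __name__ == "__main__":
    # Example: a 20,000-token prompt that yields a 1,000-token answer.
    print(f"cost: ${estimate_cost_usd(20_000, 1_000):.4f}")
    print(f"time: {estimate_generation_seconds(1_000):.1f} s")
```

Under these assumptions, a 20,000-token prompt with a 1,000-token answer costs a little under one cent and takes about four seconds to generate.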

Public benchmark scores

Sourced from each provider's published numbers. Verify before quoting.

Quality index

MMLU: 85.3
GPQA: 65.2
HumanEval: 88.1
MATH: (no score listed)
SWE-bench: n/a
Arena Elo: n/a

Sources: google-deepmind-blog

Capabilities

tools, vision, audio_input, json_mode, structured_output, streaming, thinking

Input modalities: file, image, text, audio, video
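One way to use this list is to check a planned request against it before routing to the model. The following is a minimal sketch, not part of Relay: the helper `unsupported` is hypothetical, and the capability names assume the run-together capability line on the card splits into the seven flags shown here.

```python
# Values copied from the card above; the helper itself is hypothetical.
CAPABILITIES = {
    "tools", "vision", "audio_input", "json_mode",
    "structured_output", "streaming", "thinking",
}
INPUT_MODALITIES = {"file", "image", "text", "audio", "video"}


def unsupported(required_capabilities, required_modalities):
    """Return, sorted, every requirement that the card does not list."""
    missing_caps = set(required_capabilities) - CAPABILITIES
    missing_mods = set(required_modalities) - INPUT_MODALITIES
    return sorted(missing_caps | missing_mods)


if __name__ == "__main__":
    # An empty list means every requirement appears on the card.
    print(unsupported({"tools", "streaming"}, {"text", "image"}))
```

A non-empty result names the requirements to drop or to route to a different model.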

Use gemini-2.5-flash via Relay

Configure the model alias in YAML, then call it from Python.

YAML

# models.yaml
version: 1
models:
  gemini-flash:
    target: google/gemini-2.5-flash
    credential: $env.GOOGLE_API_KEY

Python

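import asyncio

from relay import Hub

async def main():
    # Load the alias definitions and keep the hub open for the duration of the call.
    async with Hub.from_yaml("models.yaml") as hub:
        resp = await hub.chat(
            "gemini-flash",
            messages=[{"role": "user", "content": "Hello"}],
        )
        # Print the reply text and the cost of the call in USD.
        print(resp.text, resp.cost_usd)

# `async with` is only valid inside a coroutine, so run it through asyncio.
asyncio.run(main())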

Install with: pip install ai5labs-relay. Full docs are on GitHub.

Compare with: anthropic/claude-3-5-sonnet-20241022, openai/gpt-4o, deepseek/deepseek-chat, xai/grok-3