qwen/qwen3-vl-8b-thinking

Input: $0.12 / 1M tokens
Output: $1.36 / 1M tokens
Context: 131K tokens
Speed: not listed
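As a rough illustration of what the listed prices mean per request, the sketch below multiplies token counts by the per-million rates. The function name and the sample token counts are hypothetical. For a thinking model, reasoning tokens may be billed as output; this page does not say either way, so treat the result as an estimate.

```python
# Listed prices for qwen/qwen3-vl-8b-thinking, in USD per 1M tokens.
INPUT_USD_PER_M = 0.12
OUTPUT_USD_PER_M = 1.36
TOKENS_PER_M = 1_000_000


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens * INPUT_USD_PER_M / TOKENS_PER_M
    output_cost = output_tokens * OUTPUT_USD_PER_M / TOKENS_PER_M
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical request: 10,000 prompt tokens and 2,000 generated tokens.
    print(f"{estimate_cost_usd(10_000, 2_000):.6f}")  # prints 0.003920
```

Output tokens cost roughly eleven times as much as input tokens here, so long generated answers tend to dominate the bill.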

Capabilities

json_mode, tools, vision

Input modalities: image, text
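Since the model accepts both image and text input, a request can pair a prompt with a picture. The sketch below builds one user message in the OpenAI-style content-parts shape, with the image inlined as a base64 data URL. Whether Relay accepts this exact shape is an assumption, not something this page confirms, and `image_message` is a hypothetical helper.

```python
import base64


def image_message(prompt: str, image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Build one user message carrying text plus an inline image.

    ASSUMPTION: OpenAI-style content parts; the shape Relay expects may differ.
    """
    # Encode the raw bytes so they can travel inside a JSON string.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    data_url = f"data:{mime_type};base64,{encoded}"
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }
```

If that shape is accepted, the returned dict would go in the `messages` list in place of a plain text message.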

Use qwen3-vl-8b-thinking via Relay

Configure the model alias in YAML, then call it from Python.

YAML

# models.yaml
version: 1
models:
  qwen3:
    target: qwen/qwen3-vl-8b-thinking
    credential: $env.QWEN_API_KEY
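
The `credential` value appears to reference an environment variable. A small pre-flight check like the one below can catch a missing key before any request is made. It is only a sketch: it scans the YAML text for `$env.NAME` patterns with a regular expression, does not use Relay itself, and `missing_env_vars` is a hypothetical helper.

```python
import os
import re

# Matches references written like `$env.QWEN_API_KEY` in the YAML above.
ENV_REF = re.compile(r"\$env\.([A-Za-z_][A-Za-z0-9_]*)")


def missing_env_vars(yaml_text: str, environ=None) -> list[str]:
    """Return the referenced variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    names = sorted(set(ENV_REF.findall(yaml_text)))
    return [name for name in names if not env.get(name)]


if __name__ == "__main__":
    with open("models.yaml", encoding="utf-8") as handle:
        missing = missing_env_vars(handle.read())
    if missing:
        raise SystemExit(f"Set these environment variables first: {missing}")
```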

Python

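import asyncio

from relay import Hub


async def main() -> None:
    # Load the alias defined in models.yaml and send one chat request.
    async with Hub.from_yaml("models.yaml") as hub:
        resp = await hub.chat(
            "qwen3",
            messages=[{"role": "user", "content": "Hello"}],
        )
        print(resp.text, resp.cost_usd)


asyncio.run(main())  # `async with` must run inside a coroutine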

pip install ai5labs-relay · full docs on GitHub

Compare with

groq/llama-3.3-70b-versatile, anthropic/claude-haiku-4-5, mistral/mistral-large-latest, openai/gpt-4o-mini