For finance, procurement, and engineering leadership
What would your LLM bill look like on India-hosted infrastructure?
Enter your current monthly burn and we'll estimate the spend-repatriation saving from routing OSS-eligible workloads to India-hosted Llama / DeepSeek / Qwen, with the remainder still served by Claude / Gemini via in-region endpoints. The math updates live; nothing leaves your browser.
Roughly what you spend across all your LLM usage per month today.
The rest is assumed to be mini-class (GPT-4o-mini, Haiku, Flash).
DPDP + sector residency rules push routing to in-region OSS regardless of cost.
Calculations are illustrative. Assumes India-hosted Llama 3.3 70B at ₹30/M tokens (Relay list price) for OSS-routable workloads. Frontier workloads stay on your current vendor. INR/USD: 83.
That's 0% off your current bill. Annualised: ₹0.
Or email us directly for a custom estimate with your real workload mix.
How we compute it
Your USD spend is reconstructed into implied tokens using the published rates for the model classes you indicate. We assume Relay routes a workload-appropriate share of those tokens to India-hosted Llama 3.3 70B at our list price (₹30/M tokens). The frontier slice stays on your existing vendor; the saving is the difference between what the OSS-routable slice costs you today and what it would cost at the Relay list price.
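As a rough illustration of the steps above, the Python sketch below reconstructs implied tokens from spend and prices the OSS-routable share at the Relay list price. The ₹30/M price and the INR/USD rate of 83 come from this page. The per-class blended prices (`FRONTIER_USD_PER_M`, `MINI_USD_PER_M`), the function and field names, and the choice to apply the OSS share uniformly across both model classes are assumptions made for the example; they are not the calculator's actual code or any vendor's published rates.

```python
from dataclasses import dataclass

INR_PER_USD = 83.0               # stated on this page
RELAY_INR_PER_M_TOKENS = 30.0    # Relay list price for Llama 3.3 70B, stated on this page

# ASSUMPTION: placeholder blended prices in USD per million tokens.
# Substitute the published rates for the models you actually use.
FRONTIER_USD_PER_M = 5.00
MINI_USD_PER_M = 0.50


@dataclass
class Estimate:
    current_inr: float
    projected_inr: float
    monthly_saving_inr: float
    saving_pct: float
    annual_saving_inr: float


def estimate_saving(monthly_usd: float,
                    frontier_spend_share: float,
                    oss_share: float) -> Estimate:
    """Estimate the monthly saving from routing `oss_share` of traffic to Relay."""
    if monthly_usd < 0:
        raise ValueError("monthly_usd must be non-negative")
    for name, value in (("frontier_spend_share", frontier_spend_share),
                        ("oss_share", oss_share)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")

    # Step 1: turn spend into implied tokens (millions) per model class.
    frontier_tokens_m = monthly_usd * frontier_spend_share / FRONTIER_USD_PER_M
    mini_tokens_m = monthly_usd * (1.0 - frontier_spend_share) / MINI_USD_PER_M
    implied_tokens_m = frontier_tokens_m + mini_tokens_m

    # Step 2: price the OSS-routable slice today and at the Relay list price.
    # ASSUMPTION: the OSS share applies uniformly to both model classes.
    current_inr = monthly_usd * INR_PER_USD
    routable_today_inr = current_inr * oss_share
    routable_relay_inr = implied_tokens_m * oss_share * RELAY_INR_PER_M_TOKENS

    # Step 3: the saving is the difference; the rest of the bill is unchanged.
    saving_inr = routable_today_inr - routable_relay_inr
    saving_pct = 100.0 * saving_inr / current_inr if current_inr else 0.0
    return Estimate(
        current_inr=current_inr,
        projected_inr=current_inr - saving_inr,
        monthly_saving_inr=saving_inr,
        saving_pct=saving_pct,
        annual_saving_inr=saving_inr * 12,
    )


if __name__ == "__main__":
    result = estimate_saving(monthly_usd=10_000, frontier_spend_share=0.5, oss_share=0.6)
    print(f"Saving: {result.saving_pct:.1f}% (annualised ₹{result.annual_saving_inr:,.0f})")
```

Under these placeholder rates, the saving can turn negative when most of the spend is already on mini-class models priced below the Relay list price, so the result is only as good as the rates you plug in.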
Where the OSS-share number comes from
Industry presets reflect what our design-partner pilots are actually routing: BFSI 85% (DPDP + sector rules force the decision), healthcare 90% (patient data residency), AI-native 60% (more demanding mix). To override a preset, change the industry to the one that best matches your own workload mix.
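The presets listed above can be expressed as a simple lookup. This is a minimal sketch: the three percentages are the ones quoted on this page, while the dictionary keys and the function name `oss_share_for` are hypothetical and chosen only for illustration.

```python
# OSS-routable share per industry, as quoted on this page.
# ASSUMPTION: the key spellings below are illustrative, not the calculator's own.
OSS_SHARE_PRESETS = {
    "bfsi": 0.85,        # DPDP + sector rules force the decision
    "healthcare": 0.90,  # patient data residency
    "ai-native": 0.60,   # more demanding mix
}


def oss_share_for(industry: str) -> float:
    """Return the preset OSS-routable share (0..1) for an industry label."""
    key = industry.strip().lower()
    if key not in OSS_SHARE_PRESETS:
        known = ", ".join(sorted(OSS_SHARE_PRESETS))
        raise KeyError(f"unknown industry {industry!r}; expected one of: {known}")
    return OSS_SHARE_PRESETS[key]


if __name__ == "__main__":
    for label in ("BFSI", "Healthcare", "AI-native"):
        print(f"{label}: {oss_share_for(label):.0%}")
```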
What the calculator does NOT include
Productivity gains from lower in-region latency. The cost of compliance / DPDP / RBI scrutiny if you stay on US-hosted endpoints. Engineering effort to migrate (Relay is drop-in for OpenAI / Anthropic SDK callers; budget ~30 min per service). The effect of per-request routing decisions, which can push the OSS share higher than the preset.
Want a custom estimate with your real workload mix?
Send us your top three workload types and approximate per-month volume. We'll come back within 48 hours with a per-workload routing recommendation and a tighter saving estimate, under NDA if needed.