For BFSI · healthcare · government · regulated SaaS
Production-grade LLM infrastructure,
deployable on your terms.
Reserved India-hosted inference, intelligent routing across models, INR-billed contracts, audited compliance posture, and signed SLAs. Deploy in our cloud, your cloud, or your data centre — your security review chooses.
Three deployment models. Pick the one your security review allows.
Same product surface, same SDK, same routing intelligence — the deployment topology adapts to your governance.
Relay Cloud
Fastest to startFrom ₹30/M tokens
- Inference
- Reserved Yotta / Tata H100 in Mumbai + Hyderabad
- Hosting
- Our infrastructure
- Latency
- 20-80 ms in-region
- Setup time
- <48 hours
- Best for
- Mid-market enterprises moving fast; teams without infra ops capacity
Relay Dedicated
Most chosenFrom ₹50L/year
- Inference
- Dedicated H100 capacity in your chosen Indian AZ
- Hosting
- Our infrastructure, dedicated tenancy
- Latency
- Predictable; capacity-guaranteed
- Setup time
- 2-4 weeks
- Best for
- BFSI, healthcare, regulated SaaS — predictable throughput + isolation
Relay Self-Hosted
Maximum controlPer-seat + support
- Inference
- Your hardware, your data centre
- Hosting
- Inside your VPC / on-prem
- Latency
- Local; depends on your infra
- Setup time
- 4-8 weeks
- Best for
- Government, defence, large BFSI with sovereign infra mandates
Compliance posture, in writing
We publish what we've actually implemented and what's in progress — not what we'll have one day. Your security team can audit the same documents we write to.
Shipping today
- Column-level RLS on every billing-controlled column in the database
- Append-only audit log for API key lifecycle (created / revoked / models changed)
- Signed webhook events with replay protection (5-min window)
- Atomic billing transactions; concurrent over-quota bursts cannot take balance negative
- Encryption in transit (TLS 1.2+) and at rest (Supabase / Postgres AES-256)
- Per-key model whitelist (the customer decides which models the router may use)
- SAML SSO (via Supabase Auth SSO; per-tenant IdP configuration)
- Documented data flow, sub-processor list, retention schedule (/privacy)
- Public RUNBOOK and SECURITY policies in the source-available repository
On the SOC 2 / DPDP roadmap
- SOC 2 Type I — auditor engaged, Q3 FY26 target
- SOC 2 Type II — 6-month observation window following Type I
- DPDP compliance program — currently aligned by architecture; formal documentation in progress
- Penetration test — external firm engagement Q4 FY26
- ISO 27001 — under evaluation; will pursue if multiple enterprise customers require it
- Bug bounty program — staged for launch once Type I is in flight
- Multi-region disaster recovery drill — quarterly cadence starting once dedicated capacity is reserved
Service-level agreements
Standard tiers below. Enterprise contracts can negotiate custom thresholds, response SLAs, and credit structures.
| Tier | Uptime | Response time | Credits on breach | Support |
|---|---|---|---|---|
| Free / OSS | — | Community | — | GitHub issues |
| Hosted Pro | 99.5% | Best-effort | — | Email, 48h |
| Enterprise Cloud | 99.9% | P1 4h · P2 8h · P3 next business day | 10-30% of monthly fee | Slack channel + email, business hours IST |
| Enterprise Dedicated | 99.95% | P1 1h · P2 4h · P3 next business day | 25-50% of monthly fee | Dedicated CSM + on-call escalation, 24/7 |
Contract mechanics built for Indian procurement
INR contracts + GST invoices
One invoice per month covering OSS-tier tokens, frontier passthrough, and reserved capacity. GST-compliant. Custom payment terms — NET 30, NET 45, NET 60.
Signed DPA + sub-processor list
Data Processing Addendum ready to attach to your master agreement. Sub-processors enumerated and disclosed (Yotta, Tata Communications, Supabase, etc.). Update notifications when the list changes.
Volume + commitment discounts
Reserved capacity contracts (12 or 24 month) earn 20-35% off list price. Multi-year commits earn further. Custom pricing on 1B+ tokens/month.
Source-available under NDA
The hosted-router source is available for your security review under NDA. The SDK is Apache-2.0 and on GitHub. No black-box layers in your inference path.
Migration assistance
Our engineering team works with yours on the pilot integration. Most teams are in production within 30 days of contract signing; the SDK is drop-in for any OpenAI-compatible caller.
Termination + portability
Apache-2.0 SDK + open data formats. If you ever leave Relay, your application code keeps working with standard SDKs directly. No vendor capture by design.
Talk to the engineering team
Send us your top three workload types, your current monthly burn, and your governance requirements. We'll come back within 48 hours with a custom estimate, deployment recommendation, and a draft contract structure.