For BFSI · healthcare · government · regulated SaaS

Production-grade LLM infrastructure,
deployable on your terms.

Reserved India-hosted inference, intelligent routing across models, INR-billed contracts, audited compliance posture, and signed SLAs. Deploy in our cloud, your cloud, or your data centre — your security review chooses.

Book an enterprise demo →Estimate your saving →

Three deployment models. Pick the one your security review allows.

Same product surface, same SDK, same routing intelligence — the deployment topology adapts to your governance.

Relay Cloud

Fastest to start

From ₹30/M tokens

Inference: Reserved Yotta / Tata H100 in Mumbai + Hyderabad
Hosting: Our infrastructure
Latency: 20-80 ms in-region
Setup time: <48 hours
Best for: Mid-market enterprises moving fast; teams without infra ops capacity

Relay Dedicated

Most chosen

From ₹50L/year

Inference: Dedicated H100 capacity in your chosen Indian AZ
Hosting: Our infrastructure, dedicated tenancy
Latency: Predictable; capacity-guaranteed
Setup time: 2-4 weeks
Best for: BFSI, healthcare, regulated SaaS — predictable throughput + isolation

Relay Self-Hosted

Maximum control

Per-seat + support

Inference: Your hardware, your data centre
Hosting: Inside your VPC / on-prem
Latency: Local; depends on your infra
Setup time: 4-8 weeks
Best for: Government, defence, large BFSI with sovereign infra mandates

Compliance posture, in writing

We publish what we've actually implemented and what's in progress — not what we'll have one day. Your security team can audit the same documents we write to.

Shipping today

Column-level RLS on every billing-controlled column in the database
Append-only audit log for API key lifecycle (created / revoked / models changed)
Signed webhook events with replay protection (5-min window)
Atomic billing transactions; concurrent over-quota bursts cannot take balance negative
Encryption in transit (TLS 1.2+) and at rest (Supabase / Postgres AES-256)
Per-key model whitelist (the customer decides which models the router may use)
SAML SSO (via Supabase Auth SSO; per-tenant IdP configuration)
Documented data flow, sub-processor list, retention schedule (/privacy)
Public RUNBOOK and SECURITY policies in the source-available repository

On the SOC 2 / DPDP roadmap

SOC 2 Type I — auditor engaged, Q3 FY26 target
SOC 2 Type II — 6-month observation window following Type I
DPDP compliance program — currently aligned by architecture; formal documentation in progress
Penetration test — external firm engagement Q4 FY26
ISO 27001 — under evaluation; will pursue if multiple enterprise customers require it
Bug bounty program — staged for launch once Type I is in flight
Multi-region disaster recovery drill — quarterly cadence starting once dedicated capacity is reserved

Service-level agreements

Standard tiers below. Enterprise contracts can negotiate custom thresholds, response SLAs, and credit structures.

Tier	Uptime	Response time	Credits on breach	Support
Free / OSS	—	Community	—	GitHub issues
Hosted Pro	99.5%	Best-effort	—	Email, 48h
Enterprise Cloud	99.9%	P1 4h · P2 8h · P3 next business day	10-30% of monthly fee	Slack channel + email, business hours IST
Enterprise Dedicated	99.95%	P1 1h · P2 4h · P3 next business day	25-50% of monthly fee	Dedicated CSM + on-call escalation, 24/7

Contract mechanics built for Indian procurement

INR contracts + GST invoices

One invoice per month covering OSS-tier tokens, frontier passthrough, and reserved capacity. GST-compliant. Custom payment terms — NET 30, NET 45, NET 60.

Signed DPA + sub-processor list

Data Processing Addendum ready to attach to your master agreement. Sub-processors enumerated and disclosed (Yotta, Tata Communications, Supabase, etc.). Update notifications when the list changes.

Volume + commitment discounts

Reserved capacity contracts (12 or 24 month) earn 20-35% off list price. Multi-year commits earn further. Custom pricing on 1B+ tokens/month.

Source-available under NDA

The hosted-router source is available for your security review under NDA. The SDK is Apache-2.0 and on GitHub. No black-box layers in your inference path.

Migration assistance

Our engineering team works with yours on the pilot integration. Most teams are in production within 30 days of contract signing; the SDK is drop-in for any OpenAI-compatible caller.

Termination + portability

Apache-2.0 SDK + open data formats. If you ever leave Relay, your application code keeps working with standard SDKs directly. No vendor capture by design.

Talk to the engineering team

Send us your top three workload types, your current monthly burn, and your governance requirements. We'll come back within 48 hours with a custom estimate, deployment recommendation, and a draft contract structure.

enterprise@ai5labs.com Run the calculator first →

Production-grade LLM infrastructure,deployable on your terms.