Verifiable inference network
Verifiable LLM inference,
anchored to physical work.
Orogen routes OpenAI-compatible inference to a vetted pool of GPU operators with TEE attestation, deterministic kernels, and validator replay sampling. Every job emits a signed receipt that customers, the network, and the chain can independently verify.
- OpenAI-compatible HTTP + SSE
- TEE-attested H100 / H200 / B200
- Burn-and-mint, USD-denominated payouts
Why Orogen
Inference that customers, operators, and the chain can all check.
Most DePIN-for-AI networks are bare GPU leasing markets — you rent a box, you trust the operator, you reconcile invoices off-protocol. Orogen is built the other way: the protocol owns the verification stack, the settlement layer, and the marketplace. Every layer is pallet-enforced and every job emits a receipt anyone can verify against the chain.
Drop-in OpenAI compatibility
LiteLLM-fronted gateway exposes /v1/chat/completions, /v1/embeddings, and SSE streaming. Point your existing OpenAI SDK at the Orogen base URL and your code works unchanged.
TEE-anchored receipts, not vibes
Every job returns a signed receipt bound to (model_hash, adapter_hash, prompt_hash, output_hash, attestation_quote). The chain stores the commitment; validators sample-replay 1–5% of jobs on identical hardware to detect drift.
USD-denominated payouts for operators
Burn-and-mint loop quotes operator rewards at oracle spot at job time. Default 70% auto-swap to USDC at protocol AMM. The 'I keep my GPU running because of speculation' failure mode is structurally prevented.
Demand-elastic emission
Year 3+ rule is pallet-enforced: mint ≤ rolling-90d burn × elasticity factor. Hard 5% supply/year ceiling. 0.5% floor for trough resilience. No foundation mint discretion, no halving.
How it works
Three steps. One loop.
The burn-and-mint engine (BME) closes every customer transaction in three on-chain steps. Read the long form in the verification stack walkthrough.
- 1
Burn
Customer tops up the gateway in USDC/USD/ETH/BTC. The gateway burns OROG at oracle spot and mints non-transferable Compute Unit Credit (CUC), USD-pegged, 30-day expiry. No skip-the-token backdoor.
- 2
Route + execute
Multi-objective router picks an operator by latency target, price tier, KV-cache locality, and reputation. Worker runs vLLM V1 / SGLang / TRT-LLM with deterministic kernels in TEE; signed response receipt returned over the wire.
- 3
Verify + settle
Validators sample 1–5% of jobs and replay on identical hardware; mismatches trigger an opML challenge window. On finalisation, CUC is consumed and fresh OROG mints — 75% to the operator, 15% to verification work, 5% treasury, 5% governance.
customer USDC ──▶ gateway burns OROG at oracle spot ──▶ CUC minted (USD-pegged, 30d)
│
▼
router picks operator by tier + region
│
▼
worker runs vLLM / SGLang / TRT-LLM in TEE
deterministic kernel, signed response receipt
│
▼
validators sample-replay 1–5% on-chain
│
▼
CUC burned → fresh OROG mints: 75% operator / 15%
verification / 5% treasury / 5% governance
Hardware tiers
Six tiers. One protocol.
Operators self-declare a tier at registration. The tier commits them to an SLA (latency, verification posture, kernel determinism). Customers pick a tier per request; the protocol does the routing. Stake floors and pricing multiples are at-launch parameters and governance-adjustable.
| Tier | Hardware floor | Verification | Typical models | Min stake | Use case |
|---|---|---|---|---|---|
| dc-premium | 8× B200 / 8× H200 / NVL72 | L1+L2+L3+L4 (1–5% sampled) | DeepSeek-V3 671B, Llama-4-MoE, frontier MoE | 500 OROG | Enterprise frontier-tier, premium latency SLA |
| dc-standard | 8× H100 SXM | L1+L2+L3+L4 (1–5% sampled) | 30–70B dense, Mixtral, large MoE | 100 OROG | Mainstream production inference |
| cloud-rented | 1–2× H100 PCIe / H200 | L1+L2+L3+L4 (5–10% sampled) | 7–30B dense | 50 OROG | Spot capacity, secondary regions |
| prosumer | 1–2× RTX 5090 / PRO 6000 | L1 stake; L3 best-effort; L4 (10%+ sampled) | 7–14B quantized | 25 OROG | Lower-tier, hobbyist-friendly |
| edge | Mac Studio Ultra, dual 3090 | Stake-only | ≤ 32B single-user | 0 OROG (deposit only) | Private single-tenant inference |
| embed-only | CPU AVX-512, Apple M-series | Stake-only, optional L6 zkML | Embeddings, re-ranking, classification | 0 OROG (deposit only) | Cheap embedding + classification flows |
Stake floors are at-launch parameters; governance can adjust ±20% per epoch with timelock. Verification layers are described in How it works.
Sourced from the research dossier
Not claims. Citations.
The Orogen design is the result of a long survey of every prior DePIN-for-AI network. Three excerpts from the design dossier that drove the architecture choices:
"Bittensor SN3 ran 70+ peers across 3 continents at 500/110 Mbps commodity bandwidth — the operator base for a global verifiable inference network already exists; what's missing is a settlement layer with real burn-mint and a credible verification stack."
"Akash's post-BME redesign shifted from a 13–15% inflationary regime to revenue-pegged net deflation when demand runs hot. The architecture is repeatable; the missing piece is anchoring it to physical-work verification rather than community signalling."
"Phala TEE attestation routes >1B tokens/day for OpenRouter clients today. The TEE substrate works in production. What's missing is the economic layer that captures that demand as token-denominated burn."
Ready to integrate?
The Python and TypeScript SDKs are OpenAI-compatible drop-ins. The CLI wallet and chain node binaries are available for self-hosters and operators preparing for the Forge testnet.