Verifiable inference network

Verifiable LLM inference,
anchored to physical work.

Orogen routes OpenAI-compatible inference to a vetted pool of GPU operators with TEE attestation, deterministic kernels, and validator replay sampling. Every job emits a signed receipt that customers, the network, and the chain can independently verify.

Download SDK How it works Read the docs

OpenAI-compatible HTTP + SSE
TEE-attested H100 / H200 / B200
Burn-and-mint, USD-denominated payouts

plate 01 · verification stack scale 1:n

Why Orogen

Inference that customers, operators, and the chain can all check.

Most DePIN-for-AI networks are bare GPU leasing markets — you rent a box, you trust the operator, you reconcile invoices off-protocol. Orogen is built the other way: the protocol owns the verification stack, the settlement layer, and the marketplace. Every layer is pallet-enforced and every job emits a receipt anyone can verify against the chain.

Drop-in OpenAI compatibility

LiteLLM-fronted gateway exposes /v1/chat/completions, /v1/embeddings, and SSE streaming. Point your existing OpenAI SDK at the Orogen base URL and your code works unchanged.

TEE-anchored receipts, not vibes

Every job returns a signed receipt bound to (model_hash, adapter_hash, prompt_hash, output_hash, attestation_quote). The chain stores the commitment; validators sample-replay 1–5% of jobs on identical hardware to detect drift.

USD-denominated payouts for operators

Burn-and-mint loop quotes operator rewards at oracle spot at job time. Default 70% auto-swap to USDC at protocol AMM. The 'I keep my GPU running because of speculation' failure mode is structurally prevented.

Demand-elastic emission

Year 3+ rule is pallet-enforced: mint ≤ rolling-90d burn × elasticity factor. Hard 5% supply/year ceiling. 0.5% floor for trough resilience. No foundation mint discretion, no halving.

How it works

Three steps. One loop.

The burn-and-mint engine (BME) closes every customer transaction in three on-chain steps. Read the long form in the verification stack walkthrough.

1
Burn

Customer tops up the gateway in USDC/USD/ETH/BTC. The gateway burns OROG at oracle spot and mints non-transferable Compute Unit Credit (CUC), USD-pegged, 30-day expiry. No skip-the-token backdoor.
2
Route + execute

Multi-objective router picks an operator by latency target, price tier, KV-cache locality, and reputation. Worker runs vLLM V1 / SGLang / TRT-LLM with deterministic kernels in TEE; signed response receipt returned over the wire.
3
Verify + settle

Validators sample 1–5% of jobs and replay on identical hardware; mismatches trigger an opML challenge window. On finalisation, CUC is consumed and fresh OROG mints — 75% to the operator, 15% to verification work, 5% treasury, 5% governance.

  customer USDC ──▶ gateway burns OROG at oracle spot ──▶ CUC minted (USD-pegged, 30d)
                                                                  │
                                                                  ▼
                                                 router picks operator by tier + region
                                                                  │
                                                                  ▼
                                       worker runs vLLM / SGLang / TRT-LLM in TEE
                                       deterministic kernel, signed response receipt
                                                                  │
                                                                  ▼
                                              validators sample-replay 1–5% on-chain
                                                                  │
                                                                  ▼
                                   CUC burned → fresh OROG mints: 75% operator / 15%
                                       verification / 5% treasury / 5% governance

Hardware tiers

Six tiers. One protocol.

Operators self-declare a tier at registration. The tier commits them to an SLA (latency, verification posture, kernel determinism). Customers pick a tier per request; the protocol does the routing. Stake floors and pricing multiples are at-launch parameters and governance-adjustable.

Orogen hardware tier matrix
Tier	Hardware floor	Verification	Typical models	Min stake	Use case
dc-premium	8× B200 / 8× H200 / NVL72	L1+L2+L3+L4 (1–5% sampled)	DeepSeek-V3 671B, Llama-4-MoE, frontier MoE	500 OROG	Enterprise frontier-tier, premium latency SLA
dc-standard	8× H100 SXM	L1+L2+L3+L4 (1–5% sampled)	30–70B dense, Mixtral, large MoE	100 OROG	Mainstream production inference
cloud-rented	1–2× H100 PCIe / H200	L1+L2+L3+L4 (5–10% sampled)	7–30B dense	50 OROG	Spot capacity, secondary regions
prosumer	1–2× RTX 5090 / PRO 6000	L1 stake; L3 best-effort; L4 (10%+ sampled)	7–14B quantized	25 OROG	Lower-tier, hobbyist-friendly
edge	Mac Studio Ultra, dual 3090	Stake-only	≤ 32B single-user	0 OROG (deposit only)	Private single-tenant inference
embed-only	CPU AVX-512, Apple M-series	Stake-only, optional L6 zkML	Embeddings, re-ranking, classification	0 OROG (deposit only)	Cheap embedding + classification flows

Stake floors are at-launch parameters; governance can adjust ±20% per epoch with timelock. Verification layers are described in How it works.

Sourced from the research dossier

Not claims. Citations.

The Orogen design is the result of a long survey of every prior DePIN-for-AI network. Three excerpts from the design dossier that drove the architecture choices:

"Bittensor SN3 ran 70+ peers across 3 continents at 500/110 Mbps commodity bandwidth — the operator base for a global verifiable inference network already exists; what's missing is a settlement layer with real burn-mint and a credible verification stack."

Research dossier — ref/bittensor/02-token-economics.md

"Akash's post-BME redesign shifted from a 13–15% inflationary regime to revenue-pegged net deflation when demand runs hot. The architecture is repeatable; the missing piece is anchoring it to physical-work verification rather than community signalling."

Research dossier — ref/economics/render-akash-aethir.md

"Phala TEE attestation routes >1B tokens/day for OpenRouter clients today. The TEE substrate works in production. What's missing is the economic layer that captures that demand as token-denominated burn."

Research dossier — ref/landscape/phala.md

Ready to integrate?

The Python and TypeScript SDKs are OpenAI-compatible drop-ins. The CLI wallet and chain node binaries are available for self-hosters and operators preparing for the Forge testnet.

Download SDK Run a node

Verifiable LLM inference, anchored to physical work.