Concept Paper · Decentralized AI Inference

PEER CYCLES

Verifiable, private AI compute.
Powered by people, proven by hardware.

A decentralized inference network where regular people share GPU compute and earn USDC. Every answer is proven to come from the right model, and your prompt stays private from the worker. Both guarantees rest on one primitive: the trusted enclave.

Pay in: USDC credits
Workers keep: 70%
Privacy: By construction
[Figure: The trusted enclave, a sealed glass compute primitive. Labels: Attestation signed · Memory sealed]
01 — The Problem

Decentralization fixed ownership.
It left two holes open.

Centralized providers log your prompts, filter your model, and revoke access at will. The first wave of alternatives gave you ownership — but skipped the two things that actually matter.

No proof of correctness

The worker claims it ran the model you paid for. You have no way to verify it didn't return a cached, cheaper, or corrupted output.

No real privacy

“We don't store prompts” is not privacy. The worker still reads your prompt in plaintext — privacy by policy, not by design.

Peer Cycles closes both. Trust stops being a promise and becomes a property of the system.

02 — Core Primitive

One enclave.
Both problems solved.

The keystone is the Trusted Execution Environment — a sealed region of a GPU or CPU that runs code no one can read or tamper with, not even the machine's owner. Peer Cycles runs on confidential GPUs (NVIDIA H100 / H200) plus CPU TEEs — Intel TDX and AMD SEV-SNP.

A. Attestation

The enclave signs a proof of exactly which weights and which code ran. This delivers verifiability.

B. Sealed memory

The worker never sees the prompt in plaintext. This delivers privacy — by construction.

[Figure: Wireframe of the trusted enclave. Caption: The Trusted Enclave · Layer 0]
03 — Architecture

Four layers, stacked on the enclave.

Every layer builds on the trust primitive beneath it.

L3

Demand wedge — OpenAI-compatible API

Any agent framework or app switches by changing one base URL. Apps and autonomous agents become customers, not just humans in a chat window. Per-request settlement runs over x402.

L2

Quality-routed market

The orchestrator routes by reputation, measured throughput, and price — never at random. Workers that fail proofs fall down the ranking and earn less. Good actors rise.
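
The ranking described above can be sketched as a weighted score. This is only an illustration: the source names the three signals (reputation, measured throughput, price) but not how they combine, so the `Worker` fields, the weights, and the normalization below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Worker:
    worker_id: str
    reputation: float      # assumed scale 0.0..1.0, e.g. share of past proofs that passed
    tokens_per_sec: float  # measured throughput
    price_credits: float   # asking price per request, must be > 0


# Hypothetical weights; the source does not publish a formula.
W_REPUTATION, W_THROUGHPUT, W_PRICE = 0.5, 0.3, 0.2


def score(w: Worker, max_tps: float, min_price: float) -> float:
    throughput = w.tokens_per_sec / max_tps   # 1.0 for the fastest worker
    cheapness = min_price / w.price_credits   # 1.0 for the cheapest worker
    return W_REPUTATION * w.reputation + W_THROUGHPUT * throughput + W_PRICE * cheapness


def rank(workers: list[Worker]) -> list[Worker]:
    """Order workers best-first; routing picks from the top, never at random."""
    max_tps = max(w.tokens_per_sec for w in workers)
    min_price = min(w.price_credits for w in workers)
    return sorted(workers, key=lambda w: score(w, max_tps, min_price), reverse=True)


workers = [
    Worker("fast-but-flaky", reputation=0.40, tokens_per_sec=120.0, price_credits=10.0),
    Worker("steady", reputation=0.95, tokens_per_sec=90.0, price_credits=12.0),
    Worker("cheap-slow", reputation=0.90, tokens_per_sec=30.0, price_credits=8.0),
]
ranking = [w.worker_id for w in rank(workers)]
```

With these made-up numbers the worker with failed proofs drops to the bottom even though it is the fastest, which is the behavior the market layer is meant to produce.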

L1

Workers — three trust tiers

Supply spans three worker types, each serving a different trust tier.

Browser: WebGPU / WebLLM

Fast, cheap inference in a browser tab. No privacy guarantee.

Native: Ollama · CUDA / Metal / Vulkan

Frontier open models on real hardware.

Confidential: TEE on confidential GPU

Attested, encrypted. The premium tier.

L0

Trust primitive

TEE attestation. The foundation every other layer is built on.

04 — Verification Model

Correctness, priced honestly by cost.

Two production modes today — plus one research track we promise nothing about.

Default

Optimistic

The worker posts a bond. Each output is committed with a hash, the model ID, and a fixed seed, so the run can be reproduced. A random subset of jobs is re-executed by independent validators.

A mismatch slashes the worker and rewards the challenger. Slashing burns supply.
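
A minimal sketch of the optimistic flow, under stated assumptions: `run_model` is a toy deterministic stand-in for seeded inference, the commitment layout is illustrative, and the 50/50 split of a slashed bond between challenger reward and burn is hypothetical (the source says only that the challenger is rewarded and that slashing burns supply).

```python
import hashlib
import random


def commitment(output: str, model_id: str, seed: int) -> str:
    """Hash that binds an output to the model ID and seed that produced it."""
    payload = "\x00".join([model_id, str(seed), output]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def run_model(model_id: str, seed: int, prompt: str) -> str:
    # Toy stand-in for reproducible, seeded inference (not a real model).
    return f"{model_id}|{seed}|{prompt.upper()}"


def pick_audits(job_ids: list[str], audit_rate: float, rng: random.Random) -> list[str]:
    """Select the random subset of jobs that validators will re-execute."""
    return [j for j in job_ids if rng.random() < audit_rate]


def settle(claimed: str, recomputed: str, bond: int, challenger_pct: int = 50) -> tuple[int, int, int]:
    """Return (bond_left, challenger_reward, burned). The split is an assumption."""
    if claimed == recomputed:
        return bond, 0, 0
    reward = bond * challenger_pct // 100
    return 0, reward, bond - reward


model_id, seed, prompt = "peer-mixtral", 7, "explain attestation"
validator_hash = commitment(run_model(model_id, seed, prompt), model_id, seed)
honest_hash = commitment(run_model(model_id, seed, prompt), model_id, seed)
cheater_hash = commitment("cached answer from a cheaper model", model_id, seed)

honest_result = settle(honest_hash, validator_hash, bond=1_000)
cheater_result = settle(cheater_hash, validator_hash, bond=1_000)
```

The fixed seed is what makes the comparison meaningful: the validator can only detect a mismatch if an honest re-run produces the same output byte for byte.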

Premium

Attested

The TEE signs which weights and code ran. The proof travels with the response, and forging it would require compromising the hardware's attestation keys.

Cryptographic certainty, shipped alongside every token.
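
On the client side, checking an attested response might look roughly like the sketch below. It covers only the comparison of measured digests against expected values. Verifying the hardware vendor's signature chain is assumed to happen elsewhere and is represented by a boolean. The `AttestationReport` fields and the replay-protection `nonce` are hypothetical and do not reflect a real report format.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class AttestationReport:
    weights_digest: str    # SHA-256 of the model weights the enclave loaded
    code_digest: str       # SHA-256 of the inference code the enclave ran
    nonce: str             # client-chosen value, guards against replayed reports (assumption)
    signature_valid: bool  # outcome of vendor signature verification, done elsewhere


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def accept(report: AttestationReport, expected_weights: str, expected_code: str, nonce: str) -> bool:
    """Accept a response only if the enclave ran exactly the expected weights and code."""
    return (
        report.signature_valid
        and hmac.compare_digest(report.weights_digest, expected_weights)
        and hmac.compare_digest(report.code_digest, expected_code)
        and hmac.compare_digest(report.nonce, nonce)
    )


expected_weights = sha256_hex(b"peer-mixtral weights v1")  # placeholder bytes
expected_code = sha256_hex(b"inference server build")      # placeholder bytes

good = AttestationReport(expected_weights, expected_code, "n-123", True)
swapped_model = AttestationReport(sha256_hex(b"cheaper model"), expected_code, "n-123", True)
replayed = AttestationReport(expected_weights, expected_code, "n-old", True)

good_ok = accept(good, expected_weights, expected_code, "n-123")
swapped_ok = accept(swapped_model, expected_weights, expected_code, "n-123")
replayed_ok = accept(replayed, expected_weights, expected_code, "n-123")
```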

Research

zkML

A cheap zero-knowledge proof of inference. Production-grade zkML for large language models does not exist yet.

Tagged as research. We promise nothing.

Privacy Model

Plaintext never leaves your machine.

  1. The client encrypts the prompt to the enclave's attested public key.

  2. Decryption happens only inside the TEE. The worker never sees plaintext.

  3. The output is encrypted back to the client. One mechanism covers privacy and correctness.

Live — runs on the real node

Watch the network actually run.

This isn't a recording. It streams from the live Peer Cycles node — the real orchestrator routes a job, proves it, seals a confidential prompt, catches a cheater, and settles the treasury. Every number below is computed on the spot.


Open runs entirely on your device — nothing leaves the browser. Verified routes to a real GPU worker and settles per request in USDC/SOL on Solana mainnet.

05 — Tiers & Pricing

Pay per request, in credits.

1 credit = $0.01, bought with USDC. Credits never expire and refund automatically when a job fails. You never need to hold $PEER to spend them.

Open

Casual chat

~10 credits
  • Browser inference
  • Fast, cheap
  • No guarantees

Confidential

Legal · medical · proprietary

~25–30 credits
  • Full TEE
  • Encrypted & attested
  • Plaintext never exposed
Workers keep 70% per job, paid in USDC
Stakers keep 80% of network margin
Treasury takes the remaining margin
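
As a worked example of the split, here is a small sketch in integer micro-USDC (USDC has six decimals, so 1 credit = $0.01 = 10,000 micro-USDC). It assumes that "network margin" means whatever remains after the worker's 70% share. That reading is an interpretation, and the function name is hypothetical.

```python
MICRO_USDC_PER_CREDIT = 10_000  # 1 credit = $0.01; 1 USDC = 1_000_000 micro-USDC


def split_payment(credits: int) -> dict[str, int]:
    """Split one job's payment into worker, staker, and treasury shares (micro-USDC)."""
    total = credits * MICRO_USDC_PER_CREDIT
    worker = total * 70 // 100       # workers keep 70% per job
    margin = total - worker          # assumed definition of "network margin"
    stakers = margin * 80 // 100     # stakers keep 80% of the margin
    treasury = margin - stakers      # treasury takes the remainder
    return {"total": total, "worker": worker, "stakers": stakers, "treasury": treasury}


confidential_job = split_payment(25)
```

For a 25-credit Confidential job ($0.25), this gives the worker $0.175, stakers $0.06, and the treasury $0.015. Integer arithmetic keeps the three shares summing exactly to the total.
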
06 — Use the Network

One base URL.
Your agents become customers.

Peer Cycles speaks the OpenAI API. Point any agent framework or app at it by changing a single base URL — humans and autonomous agents alike. Per-request settlement runs over x402: no subscription, no token to hold, you pay only for what you run.

Send a request
# OpenAI-compatible — switch one base URL
curl https://<your-node>/v1/chat/completions \
  -H "authorization: Bearer <account>" \
  -H "content-type: application/json" \
  -d '{
    "model": "peer-mixtral",
    "peer_tier": "verified",
    "messages": [{ "role":"user", "content":"explain attestation" }]
  }'

Tiers: open · verified · confidential. Add "stream": true for token-by-token SSE.

Pay per request
# 1 credit = $0.01, bought with USDC. Refunds on failure.
curl .../credits/buy -d '{ "account":"me", "usdc": 10 }'

# Out of credits? The network answers x402:
HTTP/1.1 402 Payment Required
{ "scheme": "x402", "price_credits": 15 }

No $PEER needed to spend. Replies on the confidential tier ship a verifiable TEE attestation; the worker never sees your prompt.

This is the live request shape — the reference node in this repo answers it verbatim.
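
The same request shape can be built from Python with only the standard library. This is a hedged sketch: the helper names are hypothetical, `https://node.example` is a placeholder base URL, and the 402 body fields mirror the example shown in this section rather than a full x402 client.

```python
import json
import urllib.request
from typing import Optional


def build_chat_request(base_url: str, account: str, prompt: str,
                       tier: str = "verified", model: str = "peer-mixtral") -> urllib.request.Request:
    """Build (but do not send) an OpenAI-compatible chat completion request."""
    body = {
        "model": model,
        "peer_tier": tier,  # open, verified, or confidential
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={"authorization": f"Bearer {account}", "content-type": "application/json"},
        method="POST",
    )


def credits_required(status: int, body: str) -> Optional[int]:
    """Return the quoted price if the node answered 402 with an x402 body, else None."""
    if status != 402:
        return None
    payload = json.loads(body)
    if payload.get("scheme") != "x402":
        return None
    return payload.get("price_credits")


req = build_chat_request("https://node.example/", "my-account", "explain attestation")
quoted = credits_required(402, '{ "scheme": "x402", "price_credits": 15 }')
```

To send the request, pass `req` to `urllib.request.urlopen`. A 402 response surfaces as `urllib.error.HTTPError`, whose `code` and `read()` can be fed to `credits_required`.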

07 — Job Lifecycle

From prompt to payout.

  1. The user sends a request through the OpenAI-compatible API or chat client.

  2. The orchestrator reads the tier and queues the job.

  3. It matches the job to an eligible idle worker by reputation, speed, and price.

  4. The worker runs inference (inside the enclave on Confidential) and streams tokens back.

  5. On Verified, validators re-execute a random subset of jobs; on Confidential, attestation ships with the output.

  6. The job completes. The worker earns USDC. A failed proof slashes the bond.
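
The lifecycle above can be read as a small state machine. The sketch below is illustrative only: the state names and the transition table are hypothetical, chosen to mirror the six steps, and are not taken from the reference node.

```python
from enum import Enum


class JobState(Enum):
    QUEUED = "queued"        # step 2: tier read, job queued
    MATCHED = "matched"      # step 3: assigned to an eligible worker
    RUNNING = "running"      # step 4: inference in progress, tokens streaming
    VERIFYING = "verifying"  # step 5: re-execution or attestation check
    PAID = "paid"            # step 6: worker earns USDC
    SLASHED = "slashed"      # step 6: failed proof, bond slashed


TRANSITIONS = {
    JobState.QUEUED: {JobState.MATCHED},
    JobState.MATCHED: {JobState.RUNNING},
    JobState.RUNNING: {JobState.VERIFYING, JobState.PAID},  # unaudited jobs settle directly
    JobState.VERIFYING: {JobState.PAID, JobState.SLASHED},
    JobState.PAID: set(),
    JobState.SLASHED: set(),
}


def advance(current: JobState, nxt: JobState) -> JobState:
    """Move to the next state, rejecting transitions the lifecycle does not allow."""
    if nxt not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {nxt.value}")
    return nxt


def run(path: list[JobState]) -> JobState:
    state = JobState.QUEUED
    for nxt in path:
        state = advance(state, nxt)
    return state


audited_and_failed = run([JobState.MATCHED, JobState.RUNNING, JobState.VERIFYING, JobState.SLASHED])
settled_unaudited = run([JobState.MATCHED, JobState.RUNNING, JobState.PAID])
```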

08 — Moat

The moat sits in staked security,
not in the code.

  • Bonded security. The staked $PEER behind honest work can't be cloned by a fork.
  • Reputation graph. Every worker's attestation history accrues over time and does not transfer.
  • Demand lock-in. OpenAI compatibility makes Peer Cycles the path of least resistance for agents and apps.
  • Confidential supply. Onboarded confidential-GPU operators are hard to replicate fast.
09 — Risks, stated honestly
  • Confidential GPUs are scarce and costly — they bottleneck the Confidential tier.
  • Verification adds cost and latency; the optimistic-to-attested balance needs tuning.
  • Demand depends on being genuinely cheaper or more private than centralized inference.
  • Security bootstraps slowly — launch must front-load staked supply.

Trust as a property,
not a promise.

Verifiable, private AI compute. Powered by people, proven by hardware.

$PEER