Verifiable LLM Execution

OpenGradient provides secure, verifiable, TEE-based LLM inference as core infrastructure for running large language models on a decentralized network. All LLM requests are routed through Trusted Execution Environments (TEEs), which provide hardware-attested verification that inference was executed correctly and that specific prompts were used.

This TEE infrastructure powers all LLM access methods on OpenGradient:

  • Python SDK: High-level Python API that handles payment, signing, and verification automatically
  • x402 Gateway: Direct HTTP access using the x402 payment protocol, enabling integration from any language or platform

Both methods use the same underlying TEE infrastructure and provide identical security guarantees.

TIP

For ML model execution using PIPE with ZKML, TEE, and Vanilla verification, see ML Execution.

Key Features

  • TEE Verification: All LLM inferences are verified using hardware-attested Trusted Execution Environments
  • Provable Prompt Usage: Cryptographically prove which prompts were used for any inference, enabling transparent verification of agent actions and decision-making
  • On-Chain TEE Registry: TEE nodes are registered and verified on-chain via blockchain consensus, eliminating any single point of trust
  • Payment-Gated Access: Secure, cryptographically-verified payment before inference execution via x402
  • Multi-Chain Payments: Payment settlement supported on both OpenGradient network and Base
  • Universal Access: Standard HTTP/REST APIs accessible from any language via x402, or simplified Python integration via SDK
  • Low Latency: Off-chain execution with on-chain payment and proof settlement

How It Works

Both the Python SDK and x402 Gateway use the same underlying payment-gated TEE inference flow. The SDK abstracts this complexity, while x402 gives you direct control over each step.

TIP

You can read more on the x402 standard here.

1. TEE Node Registration

Before serving any inference requests, TEE nodes must be registered and verified on-chain through the TEE Registry. This ensures that every node a client connects to is running approved code inside attested hardware.

2. Initial Request

The client makes an HTTP request to the LLM inference endpoint:

http
POST /v1/chat/completions HTTP/1.1
Host: llmogevm.opengradient.ai
Content-Type: application/json

{
  "model": "openai/gpt-4o",
  "messages": [{"role": "user", "content": "Explain quantum computing in simple terms"}],
  "max_tokens": 200,
  "temperature": 0.7
}

3. Payment Requirement Response

The server responds with a 402 Payment Required status and payment details in the X-PAYMENT-REQUIRED header:

http
HTTP/1.1 402 Payment Required
X-PAYMENT-REQUIRED: {
  "amount": "0.001",
  "currency": "OUSDC",
  "chain_id": 10740,
  "payment_id": "0x1234...",
  "expires_at": "2024-01-15T10:30:00Z"
}

NOTE

The HTTP response format shown in these examples is illustrative. Actual responses from the API may differ in structure or field names. Always refer to the actual API responses when implementing integrations.

4. Payment Creation

The client creates a payment payload and cryptographically signs it:

javascript
// Example payment payload creation
const paymentPayload = {
  payment_id: "0x1234...",
  amount: "0.001",
  currency: "OUSDC",
  chain_id: 10740,
  timestamp: Date.now(),
  nonce: generateNonce() // app-provided helper producing a unique replay-protection value
};

// Sign the payment payload
const signature = await wallet.signMessage(
  JSON.stringify(paymentPayload)
);

5. Payment Submission

The client resubmits the request with the signed payment in the X-PAYMENT header:

http
POST /v1/chat/completions HTTP/1.1
Host: llmogevm.opengradient.ai
Content-Type: application/json
X-PAYMENT: {
  "payload": {...},
  "signature": "0xabcd...",
  "address": "0x742d35Cc6634C0532925a3b844Bc9e7595f0bEb"
}

{
  "model": "openai/gpt-4o",
  "messages": [{"role": "user", "content": "Explain quantum computing in simple terms"}],
  "max_tokens": 200,
  "temperature": 0.7
}

6. Payment Verification

The server (or optional Facilitator) verifies the payment signature. The LLM server at https://llmogevm.opengradient.ai handles payment verification internally, which may involve interaction with the facilitator contract at address 0x339c7de83d1a62edafbaac186382ee76584d294f.
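
For illustration, here is a minimal sketch of the check a verifier could perform, assuming the payload is signed with EIP-191 personal_sign (matching wallet.signMessage in the JavaScript example above) and using the eth_account library:

python
# Minimal sketch: recover the signer of an x402 payment payload.
import json

from eth_account import Account
from eth_account.messages import encode_defunct

def verify_payment_signature(payload: dict, signature: str, claimed: str) -> bool:
    # The serialization must match the client byte-for-byte
    # (JSON.stringify emits no whitespace around separators).
    message = json.dumps(payload, separators=(",", ":"))
    recovered = Account.recover_message(encode_defunct(text=message),
                                        signature=signature)
    return recovered.lower() == claimed.lower()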

7. Inference Execution with TEE Verification

Once payment is verified, the server executes the LLM inference on OpenGradient's decentralized network using TEE nodes. The inference is routed through TEE nodes to third-party LLM APIs, and the results are returned with TEE attestation:

http
HTTP/1.1 200 OK
Content-Type: application/json
X-PAYMENT-RESPONSE: {
  "payment_id": "0x1234...",
  "tx_hash": "0x5678...",
  "settled": true
}

{
  "model": "openai/gpt-4o",
  "completion": "Quantum computing is a revolutionary computing paradigm...",
  "finish_reason": "stop",
  "usage": {
    "prompt_tokens": 12,
    "completion_tokens": 187,
    "total_tokens": 199
  },
  "verification": {
    "method": "TEE",
    "proof": "0x9abc...",
    "verified_by": "opengradient-network"
  }
}

NOTE

As above, this response format is illustrative; actual API responses may differ in structure or field names.

8. Payment Settlement

After inference execution, the payment is settled on-chain. Payments can be settled on either the OpenGradient network or Base, giving users flexibility in how they pay for inference. The LLM server at llmogevm.opengradient.ai handles payment settlement internally, which may involve interaction with the facilitator contract to submit the transaction to the blockchain.

9. LLM Settlement (Proof Verification)

After the inference is executed and the TEE attestation is generated, the proof of TEE inference is posted and verified on the blockchain. This LLM settlement process ensures that:

  • Proof Posting: The TEE attestation proof is posted to the blockchain as part of the settlement transaction
  • On-Chain Verification: The proof is verified on-chain by validators to ensure the inference was executed correctly
  • Immutable Record: The proof and inference results are permanently recorded on-chain for auditability and transparency

The settlement transaction includes:

  • The TEE attestation proof
  • Inference data (varies by settlement mode - see Settlement Modes below)
  • Payment settlement information
  • Timestamp and block information

Once the proof is posted and verified on-chain, the inference execution is considered fully settled and verified. This provides cryptographic guarantees that the LLM inference was executed correctly within the trusted execution environment.
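
As a quick illustration, a client can confirm that the settlement transaction from the X-PAYMENT-RESPONSE header landed on-chain using web3.py. The RPC URL below is a placeholder assumption, not an official endpoint:

python
# Minimal sketch: confirm the settlement transaction from X-PAYMENT-RESPONSE.
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("https://rpc.example.opengradient.ai"))  # placeholder RPC URL

tx_hash = "0x5678..."  # tx_hash from the X-PAYMENT-RESPONSE header
receipt = w3.eth.get_transaction_receipt(tx_hash)
assert receipt["status"] == 1, "settlement transaction reverted"
print("settled in block", receipt["blockNumber"])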

TIP

You can find and verify settlement transactions on the OpenGradient block explorer.

Settlement Modes

Clients can choose from three modes of settlement, each offering different levels of on-chain data visibility:

  • SETTLE_INDIVIDUAL: Includes only input/output hashes for individual inference. This is the most gas-efficient option, storing minimal data on-chain while still providing cryptographic proof of execution. Input hashes enable verification of which prompts were used, allowing you to verify the full prompt data against the hash when needed (see the sketch after the table below).

  • SETTLE_BATCH: Batch hashes for multiple inferences. Useful for applications that need to settle multiple inferences in a single transaction, reducing gas costs per inference. Like SETTLE_INDIVIDUAL, batch hashes provide cryptographic proof of prompt usage for all inferences in the batch.

  • SETTLE_INDIVIDUAL_WITH_METADATA: Includes full model information, complete input and output data, and all inference metadata. This mode provides full visibility on block explorers and is useful for publicly traceable and verifiable execution.

| Mode | On-Chain Data | Signature Verified | Best For |
| --- | --- | --- | --- |
| SETTLE_INDIVIDUAL | Input/output hashes | Yes | Agent reasoning that handles money or makes business decisions |
| SETTLE_BATCH | Batch hashes | No | Secure inference for chatbots and high-throughput applications |
| SETTLE_INDIVIDUAL_WITH_METADATA | Full input, output, and metadata | Yes | Publicly traceable and verifiable execution (e.g. auditable DeFi agents) |
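
To illustrate hash-based prompt verification, the sketch below recomputes an input hash locally and compares it to the on-chain value. The serialization and hash function (canonical JSON and SHA-256) are assumptions made for illustration; the settlement contract defines the canonical scheme:

python
# Minimal sketch: recompute an input hash and compare to the on-chain value.
# Canonical JSON + SHA-256 is an assumed scheme, used here for illustration.
import hashlib
import json

def input_hash(messages: list[dict]) -> str:
    canonical = json.dumps(messages, separators=(",", ":"), sort_keys=True)
    return "0x" + hashlib.sha256(canonical.encode()).hexdigest()

onchain_hash = "0x..."  # read from the settlement transaction
messages = [{"role": "user", "content": "Explain quantum computing in simple terms"}]
print("prompt matches on-chain record:", input_hash(messages) == onchain_hash)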

TEE Verification & Trust Model

All LLM inference on OpenGradient uses Trusted Execution Environments (TEEs) for verification. TEE nodes route LLM requests to third-party LLM APIs (like OpenAI, Anthropic, etc.) while providing hardware-attested cryptographic proof that the inference was executed correctly and that specific prompts were used.

NOTE

TEE verification is the standard and only verification method for LLM inference on OpenGradient (both SDK and x402 Gateway). For ML execution with multiple verification options (ZKML, TEE, Vanilla), see ML Execution.

How TEE Verification Works

  • LLM requests are routed through TEE nodes to third-party LLM providers
  • TEE nodes provide hardware-attested verification of the inference execution
  • The attestation proves that the inference was executed correctly within the trusted environment
  • All inferences are verified by OpenGradient's decentralized network using TEE attestation

TEE provides strong security guarantees with negligible performance overhead, making it ideal for LLM inference.
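
As a small example, a client might sanity-check the verification metadata returned with each completion. This uses the illustrative response schema from step 7, so field names may differ in the live API:

python
# Minimal sketch: check the verification metadata on a completion response.
# Field names follow the illustrative schema in step 7 and may differ live.
def assert_tee_verified(response: dict) -> None:
    verification = response.get("verification", {})
    if verification.get("method") != "TEE":
        raise ValueError("response is missing a TEE attestation")
    print("TEE proof:", verification.get("proof"))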

TEE Registry

Every TEE node on OpenGradient must be registered on-chain before it can serve inference requests. The TEE Registry is a smart contract that implements the ITEERegistry.sol interface, deployed on the OpenGradient blockchain. This means every registration and verification operation is executed by all validators as part of blockchain consensus. This eliminates any single point of trust — no single party can fraudulently register a TEE or tamper with stored certificates and keys, because every validator independently executes the same verification logic and must agree on the result.

This registration process cryptographically binds the enclave's identity to the blockchain, ensuring that users can independently verify they are communicating with a legitimate, approved enclave.

Registration Parameters

TEE nodes are registered via the registerTEEWithAttestation method on the TEE Registry contract with the following parameters:

| Parameter | Description |
| --- | --- |
| attestation | Raw AWS Nitro attestation document |
| signingPublicKey | RSA public key used by the enclave to sign settlement transactions |
| tlsCertificate | TLS certificate generated inside the enclave, used for HTTPS connections |
| paymentAddress | Address where the TEE receives payments |
| endpoint | IP or DNS endpoint of the enclave |
| teeType | TEE type identifier |
Verification Checks

Registration only succeeds after the contract performs the following checks:

  1. Caller authorization — Only admin accounts can register TEEs.
  2. TEE type validity — The specified TEE type must exist and be active.
  3. Attestation authenticity — The AWS Nitro attestation document is verified against the AWS root certificate, proving it was issued by genuine Nitro hardware.
  4. Approved code verification — PCR values (PCR0, PCR1, PCR2) are extracted from the attestation and compared against the list of approved PCR hashes stored on-chain. This list is managed through an on-chain approval process, ensuring transparency over which enclave code is permitted to run, and the comparison proves the enclave is running exactly the approved code.
  5. TLS certificate binding — The contract computes SHA256(TLS certificate public key) and verifies it matches the hash embedded in the attestation's user_data field. This proves the TLS certificate was generated inside this specific enclave instance.
  6. Signing key binding — The contract computes SHA256(signing public key) and verifies it matches the hash in the attestation's user_data field. This proves the signing key genuinely originated from this enclave (both hash-binding checks are sketched below).
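
The hash-binding checks (5 and 6) reduce to simple digest comparisons. A minimal sketch, assuming user_data is the concatenation of the two 32-byte SHA-256 digests; the registry contract defines the actual encoding:

python
# Minimal sketch of checks 5 and 6: bind the TLS certificate and signing key
# to this enclave instance via the attestation's user_data field. The
# user_data layout (two concatenated 32-byte digests) is an assumption.
import hashlib

def verify_key_binding(tls_pubkey_der: bytes, signing_pubkey_der: bytes,
                       user_data: bytes) -> bool:
    expected = (hashlib.sha256(tls_pubkey_der).digest() +
                hashlib.sha256(signing_pubkey_der).digest())
    return user_data == expected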

On-Chain Registry Data

Once all verifications pass, the following data is stored on-chain:

  • TEE ID (derived from the signing public key)
  • Owner and payment address
  • Enclave endpoint
  • TLS certificate — users download this to establish secure HTTPS connections
  • Signing public key — the blockchain uses this to verify settlement transaction signatures
  • PCR hash, TEE type, and registration timestamps

Secure Connection Establishment

When a user connects to a TEE node, trust is established through the on-chain registry rather than through traditional certificate authorities:

  1. Certificate retrieval — The user downloads the TLS certificate for the target TEE from the blockchain.
  2. Connection establishment — The user initiates an HTTPS connection to the TEE's endpoint using the on-chain certificate as the trusted root.
  3. Cryptographic guarantee — Because the registry verified that the TLS certificate's public key hash matches the attestation's user_data, the user has cryptographic proof that the TLS endpoint is served by the attested enclave — not a man-in-the-middle.

This means users do not need to trust any external certificate authority. The chain of trust flows directly from AWS Nitro hardware attestation, through the on-chain registry, to the TLS connection.
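
A minimal sketch of this certificate pinning with the requests library; get_tee_certificate is a hypothetical helper standing in for a TEE Registry contract lookup:

python
# Minimal sketch: pin HTTPS to the enclave's on-chain TLS certificate.
# get_tee_certificate is a hypothetical helper wrapping a registry lookup.
import tempfile

import requests

cert_pem = get_tee_certificate(tee_id)  # hypothetical registry lookup
with tempfile.NamedTemporaryFile(mode="w", suffix=".pem", delete=False) as f:
    f.write(cert_pem)
    ca_path = f.name

# verify= replaces the system CA bundle with the on-chain certificate, so
# only the attested enclave can terminate this TLS connection.
response = requests.post(
    "https://llmogevm.opengradient.ai/v1/chat/completions",
    json={"model": "openai/gpt-4o",
          "messages": [{"role": "user", "content": "hello"}]},
    verify=ca_path,
)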

Settlement & Response Verification

After inference is executed, the facilitator posts a settlement transaction to the blockchain, creating an immutable on-chain record for every inference. This record is privacy-preserving — only hashes of the input and output are stored on-chain by default (unless SETTLE_INDIVIDUAL_WITH_METADATA mode is used).

As part of settlement, the blockchain verifies the TEE node's signature on the inference response by checking it against the signing public key registered in the TEE Registry. This ensures the response genuinely came from an attested enclave and was not tampered with. Note that in batch settlement mode, individual inference signatures are not currently verified on-chain, though all other guarantees — including TEE registration, attestation, and secure connections — still apply.
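
For illustration, a client could perform the same signature check off-chain with the cryptography library. The padding and digest scheme (PKCS#1 v1.5 with SHA-256) are assumptions; the settlement logic defines the actual scheme:

python
# Minimal sketch: verify a TEE node's RSA signature on an inference response
# against the signing public key registered on-chain. PKCS#1 v1.5 / SHA-256
# is an assumed scheme.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import padding

def signed_by_registered_enclave(response_bytes: bytes, signature: bytes,
                                 signing_pubkey_pem: bytes) -> bool:
    public_key = serialization.load_pem_public_key(signing_pubkey_pem)
    try:
        public_key.verify(signature, response_bytes,
                          padding.PKCS1v15(), hashes.SHA256())
        return True
    except InvalidSignature:
        return False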

Security Guarantees

After a TEE passes registration, the following guarantees hold:

  • Code integrity — The enclave is running approved code, verified via PCR values from the hardware attestation against the on-chain list of approved PCR hashes.
  • Key authenticity — Both the TLS certificate and the signing key provably originated from within that specific enclave instance, verified via hash binding in the attestation's user_data.
  • Connection security — Users can establish end-to-end encrypted connections to the enclave using the on-chain TLS certificate, with no reliance on external certificate authorities.
  • Settlement integrity — Every inference produces an immutable on-chain record. The blockchain verifies that inference responses were signed by a registered, attested enclave using the on-chain signing public key before accepting the settlement.
  • Consensus-backed verification — All registry and settlement operations are executed by every validator as part of blockchain consensus, so no single party can subvert the verification process.

Facilitators

Facilitators are optional services that handle payment verification and settlement complexity. OpenGradient provides a facilitator service and endpoint (though others can run facilitator services too). Facilitators provide:

  • Payment Verification: Cryptographic verification of payment signatures
  • Settlement Management: On-chain transaction submission and confirmation
  • Payment Method Abstraction: Support for multiple payment methods (stablecoins, crypto, fiat)
  • Rate Limiting & Quotas: Usage tracking and rate limiting
  • Receipt Generation: Transaction receipts and audit trails

OpenGradient Facilitator:

  • Endpoint: llmogevm.opengradient.ai (LLM server endpoint that handles facilitator interactions)
  • Facilitator Address: 0x339c7de83d1a62edafbaac186382ee76584d294f

When you send requests to llmogevm.opengradient.ai, the LLM server handles payment verification and settlement internally, interacting with the facilitator contract as needed.

NOTE

Facilitators are optional. Servers can handle payment verification and settlement directly when accepting stablecoins or crypto payments. OpenGradient provides a facilitator contract at address 0x339c7de83d1a62edafbaac186382ee76584d294f, but others can also deploy and use their own facilitator contracts.

Supported Models

OpenGradient's TEE LLM infrastructure supports the following models (accessible via both SDK and x402):

  • openai/gpt-4.1
  • openai/gpt-4o
  • anthropic/claude-4.0-sonnet
  • anthropic/claude-3.5-haiku
  • x-ai/grok-3-beta
  • x-ai/grok-3-mini-beta
  • x-ai/grok-4-1-fast-non-reasoning
  • google/gemini-2.5-flash-preview
  • google/gemini-2.5-pro-preview

These models are routed through TEE nodes to third-party LLM APIs. For more information on TEE LLMs, see TEE LLMs.

Integration Example

The OpenGradient Python SDK provides high-level abstractions that handle the x402 payment flow automatically:

python
import opengradient as og

# Initialize SDK
client = og.Client(
    private_key="<private_key>",
    email=None,
    password=None
)

# Run LLM inference via x402
result = client.llm.completion(
    model=og.TEE_LLM.GPT_4O,
    prompt="Explain quantum computing in simple terms",
    max_tokens=200,
    temperature=0.7
)

print("Completion:", result.completion_output)
print("Payment hash:", result.payment_hash)

TIP

For more details on using the Python SDK for LLM inference, see the LLM SDK Guide. For direct HTTP integration using the x402 protocol in TypeScript, Go, and other languages, see the x402 Gateway documentation.

Use Cases

OpenGradient's TEE LLM infrastructure is ideal for:

  • LLM-as-a-Service: Building private and verifiable LLM inference services
  • AI Agents with Provable Actions: Building autonomous agents where you can cryptographically prove which prompt was used to take a specific action, enabling full transparency and auditability for agent decisions
  • Resolution and Decision Verification: Verifying that resolutions or decisions used the correct prompt with accurate data inputs, ensuring fair and transparent outcomes with on-chain proof
  • Web Applications: Integrating AI capabilities into web apps via REST APIs
  • Microservices: Adding AI inference to existing microservice architectures
  • Content Generation: Building content generation tools and applications
  • Chat Applications: Creating chat interfaces with verified LLM backends
  • API Gateways: Providing AI inference through API gateways and proxies

Next Steps