Verifiable LLM Execution
OpenGradient provides secure, verifiable LLM inference as core infrastructure for running large language models on a decentralized network. All LLM requests are routed through Trusted Execution Environments (TEEs), which provide hardware-attested verification that inference was executed correctly and that specific prompts were used.
This TEE infrastructure powers all LLM access methods on OpenGradient:
- Python SDK: High-level Python API that handles payment, signing, and verification automatically
- x402 Gateway: Direct HTTP access using the x402 payment protocol, enabling integration from any language or platform
Both methods use the same underlying TEE infrastructure and provide identical security guarantees.
TIP
For ML model execution using PIPE with ZKML, TEE, and Vanilla verification, see ML Execution.
Key Features
- TEE Verification: All LLM inferences are verified using hardware-attested Trusted Execution Environments
- Provable Prompt Usage: Cryptographically prove which prompts were used for any inference, enabling transparent verification of agent actions and decision-making
- On-Chain TEE Registry: TEE nodes are registered and verified on-chain via blockchain consensus, eliminating any single point of trust
- Payment-Gated Access: Secure, cryptographically-verified payment before inference execution via x402
- Multi-Chain Payments: Payment settlement supported on both OpenGradient network and Base
- Universal Access: Standard HTTP/REST APIs accessible from any language via x402, or simplified Python integration via SDK
- Low Latency: Off-chain execution with on-chain payment and proof settlement
How It Works
Both the Python SDK and x402 Gateway use the same underlying payment-gated TEE inference flow. The SDK abstracts this complexity, while x402 gives you direct control over each step.
TIP
You can read more on the x402 standard here.
1. TEE Node Registration
Before serving any inference requests, TEE nodes must be registered and verified on-chain through the TEE Registry. This ensures that every node a client connects to is running approved code inside attested hardware.
2. Initial Request
The client makes an HTTP request to the LLM inference endpoint:
POST /v1/chat/completions HTTP/1.1
Host: llmogevm.opengradient.ai
Content-Type: application/json
{
"model": "openai/gpt-4o",
"messages": [{"role": "user", "content": "Explain quantum computing in simple terms"}],
"max_tokens": 200,
"temperature": 0.7
}
3. Payment Requirement Response
The server responds with a 402 PAYMENT-REQUIRED status and payment details:
HTTP/1.1 402 Payment Required
X-PAYMENT-REQUIRED: {
"amount": "0.001",
"currency": "OUSDC",
"chain_id": 10740,
"payment_id": "0x1234...",
"expires_at": "2024-01-15T10:30:00Z"
}
NOTE
The HTTP response format shown in these examples is illustrative. Actual responses from the API may differ in structure or field names. Always refer to the actual API responses when implementing integrations.
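Before it can construct a payment, a client needs to detect the 402 status and decode the payment details. A minimal Python sketch, assuming the header name and field layout shown in the example above (per the note, the live API may differ):

```python
import json

def parse_payment_required(status_code, headers):
    """Decode payment details from a 402 response, or return None.

    The header name and field names follow the illustrative example
    above; treat them as assumptions when integrating.
    """
    if status_code != 402:
        return None
    raw = headers.get("X-PAYMENT-REQUIRED")
    if raw is None:
        raise ValueError("402 response missing X-PAYMENT-REQUIRED header")
    details = json.loads(raw)
    # Sanity-check the fields a client needs before constructing a payment.
    for field in ("amount", "currency", "chain_id", "payment_id", "expires_at"):
        if field not in details:
            raise ValueError(f"payment details missing field: {field}")
    return details
```

A client would run this on every response and fall through to the payment-creation step whenever it returns details instead of None.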
4. Payment Creation
The client creates a payment payload and cryptographically signs it:
// Example payment payload creation
const paymentPayload = {
payment_id: "0x1234...",
amount: "0.001",
currency: "OUSDC",
chain_id: 10740,
timestamp: Date.now(),
nonce: generateNonce()
};
// Sign the payment payload
const signature = await wallet.signMessage(
JSON.stringify(paymentPayload)
);
5. Payment Submission
The client resubmits the request with the payment signature:
POST /v1/chat/completions HTTP/1.1
Host: llmogevm.opengradient.ai
Content-Type: application/json
X-PAYMENT: {
"payload": {...},
"signature": "0xabcd...",
"address": "0x742d35Cc6634C0532925a3b844Bc9e7595f0bEb"
}
{
"model": "openai/gpt-4o",
"messages": [{"role": "user", "content": "Explain quantum computing in simple terms"}],
"max_tokens": 200,
"temperature": 0.7
}
6. Payment Verification
The server (or optional Facilitator) verifies the payment signature. The LLM server at https://llmogevm.opengradient.ai handles payment verification internally, which may involve interaction with the facilitator contract at address 0x339c7de83d1a62edafbaac186382ee76584d294f.
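Conceptually, verification amounts to checking that the submitted payment matches the quote the server issued and has not been replayed. A simplified sketch of those field-level checks (the wallet-signature check itself, e.g. ECDSA recovery of the payer address, is elided because it depends on the chain's signature scheme):

```python
from datetime import datetime, timezone

def verify_payment(payload, required, seen_nonces):
    """Sketch of the checks a verifier runs before executing inference."""
    # The payment must reference the challenge the server issued.
    if payload.get("payment_id") != required["payment_id"]:
        return False
    # Amount, currency, and chain must match the quote.
    quoted = (required["amount"], required["currency"], required["chain_id"])
    offered = (payload.get("amount"), payload.get("currency"), payload.get("chain_id"))
    if offered != quoted:
        return False
    # The quote must not have expired.
    expires = datetime.fromisoformat(required["expires_at"].replace("Z", "+00:00"))
    if datetime.now(timezone.utc) >= expires:
        return False
    # Reject nonce reuse (replay protection).
    if payload.get("nonce") in seen_nonces:
        return False
    seen_nonces.add(payload["nonce"])
    return True
```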
7. Inference Execution with TEE Verification
Once payment is verified, the server executes the LLM inference on OpenGradient's decentralized network using TEE nodes. The inference is routed through TEE nodes to third-party LLM APIs, and the results are returned with TEE attestation:
HTTP/1.1 200 OK
Content-Type: application/json
X-PAYMENT-RESPONSE: {
"payment_id": "0x1234...",
"tx_hash": "0x5678...",
"settled": true
}
{
"model": "openai/gpt-4o",
"completion": "Quantum computing is a revolutionary computing paradigm...",
"finish_reason": "stop",
"usage": {
"prompt_tokens": 12,
"completion_tokens": 187,
"total_tokens": 199
},
"verification": {
"method": "TEE",
"proof": "0x9abc...",
"verified_by": "opengradient-network"
}
}
NOTE
The HTTP response format shown in these examples is illustrative. Actual responses from the API may differ in structure or field names. Always refer to the actual API responses when implementing integrations.
8. Payment Settlement
After inference execution, the payment is settled on-chain. Payments can be settled on either the OpenGradient network or Base, giving users flexibility in how they pay for inference. The LLM server at llmogevm.opengradient.ai handles payment settlement internally, which may involve interaction with the facilitator contract to submit the transaction to the blockchain.
9. LLM Settlement (Proof Verification)
After the inference is executed and the TEE attestation is generated, the proof of TEE inference is posted and verified on the blockchain. This LLM settlement process ensures that:
- Proof Posting: The TEE attestation proof is posted to the blockchain as part of the settlement transaction
- On-Chain Verification: The proof is verified on-chain by validators to ensure the inference was executed correctly
- Immutable Record: The proof and inference results are permanently recorded on-chain for auditability and transparency
The settlement transaction includes:
- The TEE attestation proof
- Inference data (varies by settlement mode - see Settlement Modes below)
- Payment settlement information
- Timestamp and block information
Once the proof is posted and verified on-chain, the inference execution is considered fully settled and verified. This provides cryptographic guarantees that the LLM inference was executed correctly within the trusted execution environment.
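The settlement contents listed above can be pictured as a record like the following. The field names here are hypothetical, chosen only to mirror the list above; the real on-chain layout is defined by OpenGradient's settlement contracts:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SettlementRecord:
    """Illustrative (hypothetical) shape of an LLM settlement transaction."""
    tee_attestation_proof: bytes  # TEE attestation proof
    inference_data: bytes         # hashes or full data, depending on settlement mode
    payment_info: str             # payment settlement information (e.g. a tx hash)
    timestamp: int                # block timestamp
    block_number: int

record = SettlementRecord(
    tee_attestation_proof=b"\x9a\xbc",
    inference_data=b"\x12\x34",
    payment_info="0x5678...",
    timestamp=1705315800,
    block_number=123456,
)
```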
TIP
You can find and verify settlement transactions on the OpenGradient block explorer.
Settlement Modes
Clients can choose from three modes of settlement, each offering different levels of on-chain data visibility:
- SETTLE_INDIVIDUAL: Includes only input/output hashes for the individual inference. This is the most gas-efficient option, storing minimal data on-chain while still providing cryptographic proof of execution. Input hashes enable verification of which prompts were used, allowing you to verify the full prompt data against the hash when needed.
- SETTLE_BATCH: Batch hashes for multiple inferences. Useful for applications that need to settle multiple inferences in a single transaction, reducing gas costs per inference. Like SETTLE_INDIVIDUAL, batch hashes provide cryptographic proof of prompt usage for all inferences in the batch.
- SETTLE_INDIVIDUAL_WITH_METADATA: Includes full model information, complete input and output data, and all inference metadata. This mode provides full visibility on block explorers and is useful for publicly traceable and verifiable execution.
| Mode | On-Chain Data | Signature Verified | Best For |
|---|---|---|---|
| SETTLE_INDIVIDUAL | Input/output hashes | Yes | Agent reasoning that handles money or makes business decisions |
| SETTLE_BATCH | Batch hashes | No | Secure inference for chatbots and high-throughput applications |
| SETTLE_INDIVIDUAL_WITH_METADATA | Full input, output, and metadata | Yes | Publicly traceable and verifiable execution (e.g. auditable DeFi agents) |
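The prompt verification that hash-based settlement enables boils down to recomputing a hash over the full input data and comparing it with the hash recorded on-chain. A minimal sketch, with the caveat that the protocol's actual canonicalization and hash scheme may differ from the JSON encoding assumed here:

```python
import hashlib
import json

def input_hash(messages):
    """Hash the full prompt data under a deterministic JSON encoding
    (an assumption for illustration; the protocol defines the real scheme)."""
    canonical = json.dumps(messages, sort_keys=True, separators=(",", ":"))
    return "0x" + hashlib.sha256(canonical.encode()).hexdigest()

def verify_prompt(messages, onchain_hash):
    """Check claimed prompt data against the hash recorded at settlement."""
    return input_hash(messages) == onchain_hash
```

Any tampering with the claimed prompt changes the recomputed hash, so a mismatch proves the prompt shown is not the one that was settled.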
TEE Verification & Trust Model
All LLM inference on OpenGradient uses Trusted Execution Environments (TEEs) for verification. TEE nodes route LLM requests to third-party LLM APIs (like OpenAI, Anthropic, etc.) while providing hardware-attested cryptographic proof that the inference was executed correctly and that specific prompts were used.
NOTE
TEE verification is the standard and only verification method for LLM inference on OpenGradient (both SDK and x402 Gateway). For ML execution with multiple verification options (ZKML, TEE, Vanilla), see ML Execution.
How TEE Verification Works
- LLM requests are routed through TEE nodes to third-party LLM providers
- TEE nodes provide hardware-attested verification of the inference execution
- The attestation proves that the inference was executed correctly within the trusted environment
- All inferences are verified by OpenGradient's decentralized network using TEE attestation
TEE provides strong security guarantees with negligible performance overhead, making it ideal for LLM inference.
TEE Registry
Every TEE node on OpenGradient must be registered on-chain before it can serve inference requests. The TEE Registry is a smart contract that implements the ITEERegistry.sol interface, deployed on the OpenGradient blockchain. This means every registration and verification operation is executed by all validators as part of blockchain consensus. This eliminates any single point of trust — no single party can fraudulently register a TEE or tamper with stored certificates and keys, because every validator independently executes the same verification logic and must agree on the result.
This registration process cryptographically binds the enclave's identity to the blockchain, ensuring that users can independently verify they are communicating with a legitimate, approved enclave.
Registration Parameters
TEE nodes are registered via the registerTEEWithAttestation method on the TEE Registry contract with the following parameters:
| Parameter | Description |
|---|---|
attestation | Raw AWS Nitro attestation document |
signingPublicKey | RSA public key used by the enclave to sign settlement transactions |
tlsCertificate | TLS certificate generated inside the enclave, used for HTTPS connections |
paymentAddress | Address where the TEE receives payments |
endpoint | IP or DNS endpoint of the enclave |
teeType | TEE type identifier |
Verification Checks
Registration only succeeds after the contract performs the following checks:
- Caller authorization — Only admin accounts can register TEEs.
- TEE type validity — The specified TEE type must exist and be active.
- Attestation authenticity — The AWS Nitro attestation document is verified against the AWS root certificate, proving it was issued by genuine Nitro hardware.
- Approved code verification — PCR values (PCR0, PCR1, PCR2) are extracted from the attestation and compared against the list of approved PCR hashes stored on-chain. The list of approved PCR hashes is managed on-chain through an approval process, ensuring transparency over which enclave code is permitted to run. This proves the enclave is running exact approved code.
- TLS certificate binding — The contract computes `SHA256(TLS certificate public key)` and verifies it matches the hash embedded in the attestation's `user_data` field. This proves the TLS certificate was generated inside this specific enclave instance.
- Signing key binding — The contract computes `SHA256(signing public key)` and verifies it matches the hash in the attestation's `user_data` field. This proves the signing key genuinely originated from this enclave.
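The PCR and key-binding checks above reduce to hash comparisons. A simplified model (the real contract parses these values out of the attestation document's CBOR structure, and the encoding of `user_data` below is an assumption):

```python
import hashlib

def verify_key_bindings(tls_cert_pubkey, signing_pubkey, user_data):
    """Check that both keys hash to the values the enclave embedded in its
    attestation's user_data. Modeled here as the two SHA-256 digests
    concatenated; the real encoding is set by the enclave code."""
    tls_hash = hashlib.sha256(tls_cert_pubkey).digest()
    signing_hash = hashlib.sha256(signing_pubkey).digest()
    return user_data == tls_hash + signing_hash

def verify_pcrs(attested_pcrs, approved_pcr_sets):
    """Check PCR0-2 from the attestation against the approved list,
    modeled as set membership over the (PCR0, PCR1, PCR2) tuple."""
    measurement = (attested_pcrs["PCR0"], attested_pcrs["PCR1"], attested_pcrs["PCR2"])
    return measurement in approved_pcr_sets
```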
On-Chain Registry Data
Once all verifications pass, the following data is stored on-chain:
- TEE ID (derived from the signing public key)
- Owner and payment address
- Enclave endpoint
- TLS certificate — users download this to establish secure HTTPS connections
- Signing public key — the blockchain uses this to verify settlement transaction signatures
- PCR hash, TEE type, and registration timestamps
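Deriving the TEE ID from the signing public key might look like the following. This derivation is hypothetical, the registry contract defines the actual scheme; the point is only that the ID is a deterministic function of the key:

```python
import hashlib

def derive_tee_id(signing_public_key):
    """Hypothetical derivation: hash the signing public key into a
    stable identifier (the registry contract defines the real scheme)."""
    return "0x" + hashlib.sha256(signing_public_key).hexdigest()
```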
Secure Connection Establishment
When a user connects to a TEE node, trust is established through the on-chain registry rather than through traditional certificate authorities:
- Certificate retrieval — The user downloads the TLS certificate for the target TEE from the blockchain.
- Connection establishment — The user initiates an HTTPS connection to the TEE's endpoint using the on-chain certificate as the trusted root.
- Cryptographic guarantee — Because the registry verified that the TLS certificate's public key hash matches the attestation's `user_data`, the user has cryptographic proof that the TLS endpoint is served by the attested enclave — not a man-in-the-middle.
This means users do not need to trust any external certificate authority. The chain of trust flows directly from AWS Nitro hardware attestation, through the on-chain registry, to the TLS connection.
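In practice this amounts to certificate pinning, with the on-chain certificate as the only trust anchor. A minimal sketch of the trust decision (a real client would instead install the on-chain certificate into its TLS stack as the trusted root for the connection):

```python
import hashlib

def fingerprint(cert_der):
    """SHA-256 fingerprint of a DER-encoded certificate."""
    return hashlib.sha256(cert_der).hexdigest()

def trust_server_cert(presented_cert, onchain_cert):
    """Trust the TLS server only if it presents exactly the certificate
    registered on-chain for this TEE; no external CA is consulted."""
    return fingerprint(presented_cert) == fingerprint(onchain_cert)
```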
Settlement & Response Verification
After inference is executed, the facilitator posts a settlement transaction to the blockchain, creating an immutable on-chain record for every inference. This record is privacy-preserving — only hashes of the input and output are stored on-chain by default (unless SETTLE_INDIVIDUAL_WITH_METADATA mode is used).
As part of settlement, the blockchain verifies the TEE node's signature on the inference response by checking it against the signing public key registered in the TEE Registry. This ensures the response genuinely came from an attested enclave and was not tampered with. Note that in batch settlement mode, individual inference signatures are not currently verified on-chain, though all other guarantees — including TEE registration, attestation, and secure connections — still apply.
Security Guarantees
After a TEE passes registration, the following guarantees hold:
- Code integrity — The enclave is running approved code, verified via PCR values from the hardware attestation against the on-chain list of approved PCR hashes.
- Key authenticity — Both the TLS certificate and the signing key provably originated from within that specific enclave instance, verified via hash binding in the attestation's `user_data`.
- Connection security — Users can establish end-to-end encrypted connections to the enclave using the on-chain TLS certificate, with no reliance on external certificate authorities.
- Settlement integrity — Every inference produces an immutable on-chain record. The blockchain verifies that inference responses were signed by a registered, attested enclave using the on-chain signing public key before accepting the settlement.
- Consensus-backed verification — All registry and settlement operations are executed by every validator as part of blockchain consensus, so no single party can subvert the verification process.
Facilitators
Facilitators are optional services that handle payment verification and settlement complexity. OpenGradient provides a facilitator service and endpoint (though others can run facilitator services too). Facilitators provide:
- Payment Verification: Cryptographic verification of payment signatures
- Settlement Management: On-chain transaction submission and confirmation
- Payment Method Abstraction: Support for multiple payment methods (stablecoins, crypto, fiat)
- Rate Limiting & Quotas: Usage tracking and rate limiting
- Receipt Generation: Transaction receipts and audit trails
OpenGradient Facilitator:
- Endpoint: `llmogevm.opengradient.ai` (LLM server endpoint that handles facilitator interactions)
- Facilitator Address: `0x339c7de83d1a62edafbaac186382ee76584d294f`
When you send requests to llmogevm.opengradient.ai, the LLM server handles payment verification and settlement internally, interacting with the facilitator contract as needed.
NOTE
Facilitators are optional. Servers can handle payment verification and settlement directly when accepting stablecoins or crypto payments. OpenGradient provides a facilitator contract at address 0x339c7de83d1a62edafbaac186382ee76584d294f, but others can also deploy and use their own facilitator contracts.
Supported Models
OpenGradient's TEE LLM infrastructure supports the following models (accessible via both SDK and x402):
- `openai/gpt-4.1`
- `openai/gpt-4o`
- `anthropic/claude-4.0-sonnet`
- `anthropic/claude-3.5-haiku`
- `x-ai/grok-3-beta`
- `x-ai/grok-3-mini-beta`
- `x-ai/grok-4-1-fast-non-reasoning`
- `google/gemini-2.5-flash-preview`
- `google/gemini-2.5-pro-preview`
These models are routed through TEE nodes to third-party LLM APIs. For more information on TEE LLMs, see TEE LLMs.
Integration Example
The OpenGradient Python SDK provides high-level abstractions that handle the x402 payment flow automatically:
import opengradient as og
# Initialize SDK
client = og.Client(
private_key="<private_key>",
email=None,
password=None
)
# Run LLM inference via x402
result = client.llm.completion(
model=og.TEE_LLM.GPT_4O,
prompt="Explain quantum computing in simple terms",
max_tokens=200,
temperature=0.7
)
print("Completion:", result.completion_output)
print("Payment hash:", result.payment_hash)
TIP
For more details on using the Python SDK for LLM inference, see the LLM SDK Guide. For direct HTTP integration using the x402 protocol in TypeScript, Go, and other languages, see the x402 Gateway documentation.
Use Cases
OpenGradient's TEE LLM infrastructure is ideal for:
- LLM-as-a-Service: Building private and verifiable LLM inference services
- AI Agents with Provable Actions: Building autonomous agents where you can cryptographically prove which prompt was used to take a specific action, enabling full transparency and auditability for agent decisions
- Resolution and Decision Verification: Verifying that resolutions or decisions used the correct prompt with accurate data inputs, ensuring fair and transparent outcomes with on-chain proof
- Web Applications: Integrating AI capabilities into web apps via REST APIs
- Microservices: Adding AI inference to existing microservice architectures
- Content Generation: Building content generation tools and applications
- Chat Applications: Creating chat interfaces with verified LLM backends
- API Gateways: Providing AI inference through API gateways and proxies
Next Steps
- Learn about ML Execution for PIPE-based ML model inference
- Explore Proof Settlement and how inference proofs are verified
- Check out the Python SDK for LLM inference
- Browse available models in the Model Hub
