ML Execution
WARNING
ML execution using PIPE is currently only available on our alpha testnet. It is not yet available on the official testnet. For production LLM inference, see LLM Execution which is available on the official testnet.
OpenGradient supports PIPE (Parallelized Inference Pre-Execution Engine) for running traditional Machine Learning models with native execution. PIPE enables pre-execution inference, allowing AI models to be called directly with atomic guarantees. ML execution supports multiple verification methods: ZKML, TEE, and Vanilla verification, each offering a different tradeoff between speed, cost, and security.
TIP
For LLM execution using x402 with TEE, see LLM Execution.
Execution Overview
PIPE is our novel inference execution method that allows ML models to be natively used from applications. This is one of our key technologies that enables OpenGradient to provide seamless and highly scalable inference to all developers.
NOTE
OpenGradient currently supports models in the ONNX format.
How PIPE Works
Using PIPE, applications on our network can natively invoke and execute AI models without introducing overhead or congestion. Inferences are executed in parallel, which allows us to compete with centralized infrastructure providers while providing full decentralization and, importantly, verifiable execution that ensures the reliability of the system.
Concretely, executing a transaction on OpenGradient involves the following steps (sketched in code after the list):
- The user submits a transaction.
- The transaction is first placed in the inference mempool.
- The inference mempool simulates the transaction and extracts the inference requests it triggers.
- The inference requests are sent to the inference network for parallel execution.
- Once all inferences have completed and their results are available, the transaction is retrieved from the mempool for further execution.
- The transaction is executed with the pre-computed inference results.
- The transaction is included in the next block.
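To make the ordering concrete, here is a minimal, purely illustrative Python sketch of the pre-execution loop described above. The function and parameter names (`simulate`, `run_inference`, `execute`, `build_block`) are hypothetical stand-ins, not part of any OpenGradient API; the sketch only shows that inference runs in parallel before the transaction is executed and included in a block.

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative only: these callables are hypothetical stand-ins, not an
# OpenGradient API. The point is the ordering: simulate -> parallel
# inference -> execute with pre-computed results -> build block.
def process_mempool(pending_txs, simulate, run_inference, execute, build_block):
    # 1. Simulate each pending transaction to extract the inference
    #    requests it would trigger.
    requests_per_tx = {tx: simulate(tx) for tx in pending_txs}

    # 2. Fan all inference requests out to the inference network in
    #    parallel (modeled here with a thread pool).
    with ThreadPoolExecutor() as pool:
        futures_per_tx = {
            tx: [pool.submit(run_inference, req) for req in reqs]
            for tx, reqs in requests_per_tx.items()
        }
        results_per_tx = {
            tx: [f.result() for f in futures]
            for tx, futures in futures_per_tx.items()
        }

    # 3. Execute each transaction with its pre-computed inference results,
    #    then include the executed transactions in the next block.
    executed = [execute(tx, results_per_tx[tx]) for tx in pending_txs]
    return build_block(executed)
```

Because all model calls resolve inside the mempool, block building itself only ever sees pre-computed results.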
Benefits of PIPE
PIPE, powered by OpenGradient's inference node architecture, is a game-changer for scalability. With PIPE, we can run inferences for hundreds or thousands of pending transactions in parallel, dramatically increasing network throughput and reducing latency for all users. These technologies let us horizontally scale our inference network and execution, offering virtually limitless capacity for AI model execution.
In addition, for applications that directly embed AI and ML models, we can offer low-latency and non-blocking execution of transactions at scale. Since all inferences are executed in the inference mempool, actual block building remains extremely fast. No single transaction can slow down the network due to, for example, an expensive and slow ML model inference.
Verification Methods
OpenGradient offers a range of cryptographic and cryptoeconomic security schemes for securing ML inference on our network. We allow developers to choose the most suitable method for their use case, making the right tradeoff between speed, cost, and security.
NOTE
Models are executed on our permissionless and scalable inference nodes and are verified and secured in a distributed fashion by all validators on the OpenGradient Network. Read more about it in OpenGradient Architecture.
ML execution using PIPE supports three verification methods:
- ZKML (Zero-Knowledge Machine Learning): Cryptographic proof verification
- TEE (Trusted Execution Environment): Hardware-attested verification
- Vanilla Inference: No verification, no overhead
Developers can pick the most suitable method for their application and use case. The table below summarizes the tradeoffs and suggested use cases:
| Method | Overhead | Security | Model Compatibility | Recommendation |
|---|---|---|---|---|
| ZKML | 1000-10000x slower | Instantly verified using cryptographic proof | ML models only | Best for smaller models serving high-impact use cases |
| TEE | Negligible overhead | Instantly verified using attestation | ML models | Best for ML models requiring strong security guarantees |
| Vanilla | No overhead | No verification | All model types | Best for Gen AI or other large models |
TIP
Even within the same transaction, users can pick different security modes for different inferences, e.g., TEE for one ML model and ZKML for another.
Selecting the right verification method is an important decision: application developers should evaluate it carefully against the risks and requirements of their use case.
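As an illustration of mixing modes, the hedged sketch below issues two inferences with different verification modes, reusing the client.inference.infer call and og.InferenceMode values shown in the Integration Examples section. The model CIDs and inputs are placeholders, and the exact parameters should be confirmed against the Python SDK documentation.

```python
import opengradient as og

client = og.Client(
    private_key="<private_key>",
    email="<email>",
    password="<password>"
)

# High-impact risk model: accept the ZKML overhead for a cryptographic proof.
risk = client.inference.infer(
    model_cid="<risk-model-cid>",        # placeholder CID
    model_input={"volatility": 0.42},    # placeholder input
    inference_mode=og.InferenceMode.ZKML
)

# Larger pricing model: TEE gives hardware attestation with negligible overhead.
price = client.inference.infer(
    model_cid="<pricing-model-cid>",     # placeholder CID
    model_input={"asset_id": 7},         # placeholder input
    inference_mode=og.InferenceMode.TEE
)
```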
ZKML Verification
ZKML (Zero-Knowledge Machine Learning) provides cryptographic proof verification for ML models. It is available for ML execution using PIPE.
Characteristics:
- Security: Instantly verified using cryptographic proof
- Overhead: 1000-10000x slower than vanilla execution
- Model Compatibility: ML models only
- Best For: Smaller models serving high-impact use cases where cryptographic guarantees are critical
ZKML provides the strongest security guarantees but comes with significant computational overhead. It is ideal for:
- Smaller ML Models: Models that can be efficiently proven in zero-knowledge
- High-Impact Use Cases: Applications where cryptographic guarantees are critical
- On-Chain Verification: When you need instant, cryptographically-verifiable results
TEE Verification
TEE (Trusted Execution Environment) provides hardware-attested verification for ML models. It is available for ML execution using PIPE.
Characteristics:
- Security: Instantly verified using hardware attestation
- Overhead: Negligible compared to vanilla execution
- Model Compatibility: ML models
- Best For: ML models requiring strong security guarantees without ZKML overhead
TEE provides strong security guarantees with minimal performance impact. It is ideal for:
- ML Models Requiring Security: Models that need strong security guarantees without ZKML overhead
- Balanced Performance: When you need security with minimal performance impact
- Hardware-Attested Execution: When hardware attestation provides sufficient security
Vanilla Inference
Vanilla inference runs models without any verification and therefore adds no overhead. It is available for ML execution using PIPE.
Characteristics:
- Security: No verification
- Overhead: No overhead
- Model Compatibility: All model types
- Best For: Large models, Gen AI models, or performance-critical applications where verification overhead would be prohibitive
Vanilla inference provides no security guarantees but offers the best performance. It is ideal for:
- Large Models: Models that are too large for efficient ZKML or TEE verification
- Gen AI Models: Generative AI models where verification overhead would be prohibitive
- Performance-Critical Applications: When maximum performance is required
Choosing the Right Verification Method
When selecting a verification method for ML execution, consider the following (a simple decision helper is sketched after this list):
- Security Requirements: How critical is cryptographic verification for your use case?
- Performance Constraints: Can you tolerate the overhead of ZKML, or do you need maximum performance?
- Model Size: Smaller models work well with ZKML, while larger models may require TEE or Vanilla
- Use Case Impact: High-impact use cases may justify ZKML overhead for maximum security
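One way to encode these considerations is a small decision helper. The function below is a hypothetical sketch (not part of the OpenGradient SDK) that maps coarse requirements onto the og.InferenceMode values used in the Integration Examples section.

```python
import opengradient as og

# Hypothetical helper, not part of the OpenGradient SDK: maps coarse
# requirements onto a verification mode, mirroring the tradeoff table above.
def choose_inference_mode(needs_verification: bool, small_model: bool) -> og.InferenceMode:
    if not needs_verification:
        # Large or Gen AI models, maximum performance, no security guarantees.
        return og.InferenceMode.VANILLA
    if small_model:
        # Cryptographic proofs for high-impact use cases; significant overhead.
        return og.InferenceMode.ZKML
    # Hardware attestation with negligible overhead.
    return og.InferenceMode.TEE
```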
To configure and select the verification method for your inference, refer to our Python SDK documentation.
When to Use PIPE
PIPE is ideal for:
- Application Integration: When you need to call ML models directly from your applications
- Atomic Transactions: When inference results must be part of a verified transaction
- DeFi Applications: When building DeFi protocols that require verifiable AI execution within transactions
- Verified AI Applications: When your application requires native AI capabilities with verification
Integration Examples
Python SDK Integration
The OpenGradient Python SDK provides high-level abstractions for PIPE-based ML inference:
```python
import opengradient as og

# Initialize SDK
client = og.Client(
    private_key="<private_key>",
    email="<email>",
    password="<password>"
)

# Run ML inference with ZKML verification
result = client.inference.infer(
    model_cid='your-model-cid',
    model_input={'feature1': 1.0, 'feature2': 2.0},
    inference_mode=og.InferenceMode.ZKML  # or TEE, VANILLA
)

print("Result:", result.model_output)
print("Transaction hash:", result.transaction_hash)
```

Comparison with LLM Execution
| Feature | PIPE (ML Execution) | x402 (LLM Execution) |
|---|---|---|
| Integration | Smart contracts | HTTP/REST APIs |
| Payment | Native on-chain | Flexible (on-chain or off-chain) |
| Use Case | DeFi, on-chain apps | Web apps, LLM services |
| Transaction Atomicity | Yes | No |
| Verification | ZKML, TEE, Vanilla | TEE |
| Latency | Block time + inference | Network + inference |
| Model Types | ML models | LLMs |
Next Steps
- Learn about LLM Execution for x402-based LLM inference with TEE
- Explore Proof Settlement and how inference proofs are verified
- Review the Python SDK for ML inference
- Browse available models in the Model Hub
