LLM Inference
The SDK currently supports two types of LLM inference:
- llm_completion for simple LLM completions
- llm_chat for more advanced LLM chat completions (including tool usage)
Both inference types support two execution modes:
- og.LlmInferenceMode.VANILLA: standard inference execution on OpenGradient's decentralized network, providing verifiable on-chain results without hardware attestation
- og.LlmInferenceMode.TEE: verified and private inference through TEE nodes that route to third-party LLM APIs (OpenAI, Gemini, Anthropic, etc.)
TEE execution provides cryptographic verification of prompts for mission-critical applications (DeFi, financial services, healthcare, etc.) and ensures privacy of personal data through hardware-attested code auditing. More information can be found in the TEE LLMs section below.
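The execution mode is selected per request via the inference_mode argument. A minimal sketch (VANILLA is assumed here to be the default when the argument is omitted):

import opengradient as og

# initialize SDK
og.init(private_key="<private_key>", email="<email>", password="<password>")

# same call, different execution mode
tx_hash, response = og.llm_completion(
    model_cid='meta-llama/Meta-Llama-3-8B-Instruct',
    prompt="Hello!",
    inference_mode=og.LlmInferenceMode.VANILLA  # or og.LlmInferenceMode.TEE
)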
Both of these functions largely mirror the OpenAI APIs, with some minor differences: for example, they take a model_cid and return a transaction hash alongside the model output.
def llm_completion(
    model_cid,
    prompt,
    max_tokens=100,
    temperature=0.0,
    stop_sequence=None)

def llm_chat(
    model_cid,
    messages,
    max_tokens=100,
    temperature=0.0,
    stop_sequence=None,
    tools=[],
    tool_choice=None)

LLM API Reference
For full definitions and documentation of these methods, please see the SDK API Reference.
Completion Example
import opengradient as og

# initialize SDK
og.init(private_key="<private_key>", email="<email>", password="<password>")

# run LLM inference
tx_hash, response = og.llm_completion(
    model_cid='meta-llama/Meta-Llama-3-8B-Instruct',
    prompt="Translate the following English text to French: 'Hello, how are you?'",
    max_tokens=50,
    temperature=0.0
)

# print output
print("Transaction Hash:", tx_hash)
print("LLM Output:", response)

Chat Example
import opengradient as og

# initialize SDK
og.init(private_key="<private_key>", email="<email>", password="<password>")

# create message history
messages = [
    {
        "role": "system",
        "content": "You are a helpful AI assistant.",
        "name": "HAL"
    },
    {
        "role": "user",
        "content": "Hello! How are you doing? Can you repeat my name?",
    }
]

# run LLM inference
tx_hash, finish_reason, message = og.llm_chat(
    model_cid="openai/gpt-4.1",
    messages=messages
)

# print output
print("Transaction Hash:", tx_hash)
print("Finish Reason:", finish_reason)
print("LLM Output:", message)

Chat Example with Tools
# Define your tools
tools = [{
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "The city to find the weather for, e.g. 'San Francisco'"
                },
                "state": {
                    "type": "string",
                    "description": "The two-letter abbreviation for the state that the city is in, e.g. 'CA' for 'California'"
                },
                "unit": {
                    "type": "string",
                    "description": "The unit to fetch the temperature in",
                    "enum": ["celsius", "fahrenheit"]
                }
            },
            "required": ["city", "state", "unit"]
        },
    }
}]
# Message conversation
messages = [
    {
        "role": "system",
        "content": "You are an AI assistant that helps the user with tasks. Use tools if necessary.",
    },
    {
        "role": "user",
        "content": "Hi! How are you doing today?"
    },
    {
        "role": "assistant",
        "content": "I'm doing well! How can I help you?",
    },
    {
        "role": "user",
        "content": "Can you tell me what the temperature will be in Dallas, in fahrenheit?"
    }
]
tx_hash, finish_reason, message = og.llm_chat(
    model_cid=og.LLM.MISTRAL_7B_INSTRUCT_V3,
    messages=messages,
    tools=tools
)

# print output
print("Transaction Hash:", tx_hash)
print("Finish Reason:", finish_reason)
print("LLM Output:", message)
We also have explicit support for using LLMs through the completion and chat commands in the CLI.
For example, you can run a completion inference with Llama-3 using the following command:
opengradient completion --model "meta-llama/Meta-Llama-3-8B-Instruct" --prompt "hello who are you?" --max-tokens 50

Or you can use files instead of text input in order to simplify your command:
opengradient chat --model "mistralai/Mistral-7B-Instruct-v0.3" --messages-file messages.json --tools-file tools.json --max-tokens 200

The list of models we support can be found in the Model Hub.
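For reference, these files contain the same JSON structures used in the Python examples above. A minimal messages.json might look like this:

[
  {"role": "system", "content": "You are a helpful AI assistant."},
  {"role": "user", "content": "Hello! How are you doing?"}
]

A tools.json file follows the same format as the tools array in the chat example with tools above.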
To get more information on how to run LLMs using the CLI, you can run:
opengradient completion --help
opengradient chat --help

TEE LLMs
OpenGradient supports LLM inference within trusted execution environments (TEEs), enabling verified and private access to both proprietary and open-source models. TEE nodes route requests to third-party LLM APIs (such as OpenAI, Gemini, Anthropic, and others) while providing critical security and verification guarantees.
Key Benefits
Verification for Mission-Critical Applications: TEE nodes enable prompt verification, making them ideal for mission-critical applications like DeFi protocols, financial services, healthcare systems, and other sensitive systems where you need cryptographic proof of what prompts were sent to the LLM.
Privacy Protection: Personal data and sensitive information remain private. TEE nodes audit and verify code execution, ensuring that your data is processed securely without exposure to unauthorized parties.
Hardware Attestation: Built on Intel TDX with confidential compute, TEE nodes provide hardware-level attestation of code execution, giving you cryptographic guarantees that the routing and verification code runs as expected before forwarding requests to third-party LLM APIs.
Usage
To utilize TEE LLM inference, use the following flags:
- inference_mode=og.LlmInferenceMode.TEE for the Python SDK
- --mode TEE for the CLI
TEE Examples
DeFi Smart Contract Analysis
import opengradient as og
# initialize SDK
og.init(private_key="<private_key>", email="<email>", password="<password>")
# Analyze a smart contract with verified prompts for audit trail
contract_code = """
function transfer(address to, uint256 amount) public {
    require(balanceOf[msg.sender] >= amount, "Insufficient balance");
    balanceOf[msg.sender] -= amount;
    balanceOf[to] += amount;
}
"""
tx_hash, response = og.llm_completion(
    model_cid='gpt-4',
    prompt=f"Analyze this Solidity smart contract function for security vulnerabilities:\n\n{contract_code}",
    max_tokens=500,
    temperature=0.0,
    inference_mode=og.LlmInferenceMode.TEE  # Verified prompt for audit compliance
)

print("Transaction Hash (verifiable on-chain):", tx_hash)
print("Security Analysis:", response)

Privacy-Sensitive Healthcare Chat
import opengradient as og
# initialize SDK
og.init(private_key="<private_key>", email="<email>", password="<password>")
# Chat with patient data - TEE ensures privacy and code verification
messages = [
    {
        "role": "system",
        "content": "You are a medical assistant. Analyze patient symptoms and provide preliminary guidance."
    },
    {
        "role": "user",
        "content": "Patient: 45-year-old male, presenting with chest pain and shortness of breath. Blood pressure: 140/90. What are the potential causes?"
    }
]
tx_hash, finish_reason, message = og.llm_chat(
    model_cid='claude-3-opus',
    messages=messages,
    max_tokens=300,
    temperature=0.1,
    inference_mode=og.LlmInferenceMode.TEE  # Patient data remains private
)

print("Transaction Hash:", tx_hash)
print("Medical Guidance:", message)

Financial Risk Assessment
import opengradient as og
# initialize SDK
og.init(private_key="<private_key>", email="<email>", password="<password>")
# Assess loan application with verified audit trail
loan_data = {
    "applicant_income": 75000,
    "credit_score": 720,
    "debt_to_income": 0.35,
    "loan_amount": 250000
}
prompt = f"""Assess this loan application for approval:
Income: ${loan_data['applicant_income']}
Credit Score: {loan_data['credit_score']}
Debt-to-Income Ratio: {loan_data['debt_to_income']}
Requested Loan: ${loan_data['loan_amount']}
Provide risk assessment and recommendation."""
tx_hash, response = og.llm_completion(
    model_cid='gpt-4',
    prompt=prompt,
    max_tokens=250,
    temperature=0.0,
    inference_mode=og.LlmInferenceMode.TEE  # Cryptographic proof of assessment criteria
)

print("Audit Trail Hash:", tx_hash)
print("Risk Assessment:", response)

CLI Usage with TEE
You can also use TEE mode from the command line:
# DeFi contract analysis with verification
opengradient completion \
  --model "gpt-4" \
  --prompt "Analyze this smart contract for reentrancy vulnerabilities..." \
  --max-tokens 500 \
  --mode TEE

# Chat with privacy guarantees
opengradient chat \
  --model "claude-3-opus" \
  --messages-file patient_query.json \
  --max-tokens 300 \
  --mode TEE

The TEE node routes your request to the third-party API while providing cryptographic verification of the prompt and ensuring your data remains private through hardware-attested code execution.
Supported Models
TEE LLMs support routing to various third-party providers including:
- OpenAI models (GPT-4, GPT-3.5, etc.)
- Google Gemini models
- Anthropic Claude models
- Open-source models like meta-llama/Llama-3.1-70B-Instruct
- More models and providers coming soon!
NOTE
This technology is cutting-edge, so access may be periodically restricted due to usage limitations.
SDK API Reference
Please refer to our API Reference for any additional details around the SDK methods.
