# NeuroML Model Inference

This guide explores how NeuroML lets you run any ML or AI model directly from your smart contract.

NeuroML is tightly integrated with our Model Hub, ensuring that every model uploaded there is instantly accessible through NeuroML. We've also made the process of uploading and using your model incredibly fast, allowing you to get to work in seconds.

OpenGradient is designed to be flexible, supporting a range of inference verification techniques, such as ZKML and TEE inference. This empowers developers to choose the most suitable methods for their use cases and requirements. To learn more about the security options we offer, go to Inference Verification.

## Model Inference Precompile

NeuroML inference is provided through a Solidity interface (called `NeuroML`) that any smart contract can use. The inference itself is implemented by a custom precompile on the OpenGradient network.

TIP

The NeuroML precompile is accessible at `0x00000000000000000000000000000000000000F4`.
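If you are not using a library that already binds this address, you can attach the `NeuroML` interface to it yourself. The following is a minimal sketch; the `OGInference.sol` library imported later on this page may already export such a constant:

```
// Bind the NeuroML interface to the precompile address
// (0x00...F4, written via uint160 to avoid address checksum issues).
NeuroML constant NEURO_ML = NeuroML(address(uint160(0xF4)));
```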

The two high-level functions exposed by the `NeuroML` interface are `runModel` and `runLllm`. `runLllm` is designed specifically for running large language models, whereas `runModel` is a generic method that can execute any AI or ML model. This page focuses on `runModel`; see LLM Inference for large language models.

The `runModel` function is defined as follows:

```
interface NeuroML {
    // security modes offered for ML inference
    enum ModelInferenceMode { VANILLA, ZKML, TEE }

    // runs any ML model
    function runModel(
        ModelInferenceMode mode,
        ModelInferenceRequest memory request
    ) external returns (ModelOutput memory);
}
```

Calling this function from your smart contract will atomically execute the requested model with the given input and return the result synchronously.

## Model Input and Output

`runModel` takes two arguments:

- `ModelInferenceMode mode`: defines the security mode that OpenGradient should use for verifying the inference (`VANILLA`, `ZKML`, or `TEE`).
- `ModelInferenceRequest request`: contains the following two fields:
  - `string modelId`: the unique ID of the model from the Model Hub that will be used for inference.
  - `ModelInput input`: defines the input that will be passed to the model during inference. The input format must match the ONNX model's expected input types and names.
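As a sketch of assembling these arguments, the following builds a request for a hypothetical model that expects a single string tensor; the model ID and tensor name are placeholders, not real Model Hub entries:

```
// Hypothetical example: build a request for a model that takes
// one string tensor named "text" (placeholder model ID and names).
ModelInput memory input = ModelInput(
    new NumberTensor[](0),
    new StringTensor[](1));

string[] memory texts = new string[](1);
texts[0] = "example input";
input.strings[0] = StringTensor("text", texts);

ModelInferenceRequest memory request = ModelInferenceRequest(
    "your-model-id", // placeholder Model Hub ID
    input);
```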

Similarly to the input, `runModel` returns a generic `ModelOutput`.

To support a wide range of models, both the input and output can take various shapes and forms (such as multidimensional numbers and strings). These generic input and output types are made up of a list of number tensors and/or string tensors. For example,

```
struct ModelInput {
    NumberTensor[] numbers;
    StringTensor[] strings;
}
```

Each tensor has a unique name that must match the expected ONNX input metadata and type.

TIP

Inspect your ONNX model's metadata to find the input and output types the model expects.

Similarly, the `ModelOutput` is represented by number and string tensors:

```
struct ModelOutput {
    NumberTensor[] numbers;
    StringTensor[] strings;
    bool is_simulation_result; // indicates whether the result is "real"
}
```

To read more about `is_simulation_result`, please see Simulation Results.

NOTE

We use fixed-point representation for numbers in the model input and output; see the `Number` type for more details. E.g., `1.52` is represented as `Number{value = 152, decimals = 2}`.
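To illustrate this representation, here is a hypothetical helper (not part of the NeuroML interface) that rescales a `Number` to a fixed decimal count, so that two values can be compared on-chain:

```
// Hypothetical helper: rescale a Number to `targetDecimals` decimals.
// E.g., rescale(Number(152, 2), 4) yields 15200 (i.e., 1.5200).
function rescale(Number memory n, int128 targetDecimals)
    internal pure returns (int128)
{
    if (n.decimals == targetDecimals) {
        return n.value;
    }
    if (n.decimals < targetDecimals) {
        // add trailing zeros (may overflow for large differences)
        return n.value *
            int128(uint128(10) ** uint128(targetDecimals - n.decimals));
    }
    // drop excess decimals (truncates toward zero)
    return n.value /
        int128(uint128(10) ** uint128(n.decimals - targetDecimals));
}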

The full definitions of the input and output types are as follows:

```
/**
 * Can be used to represent a floating-point number or an integer.
 *
 * E.g., 10 can be represented as Number(10, 0),
 * and 1.5 can be represented as Number(15, 1).
 */
struct Number {
    int128 value;
    int128 decimals;
}

/**
 * Represents a model tensor input filled with numbers.
 */
struct NumberTensor {
    string name;
    Number[] values;
}

/**
 * Represents a model tensor input filled with strings.
 */
struct StringTensor {
    string name;
    string[] values;
}

/**
 * Model input, made up of various tensors of numbers and/or strings.
 */
struct ModelInput {
    NumberTensor[] numbers;
    StringTensor[] strings;
}

/**
 * Model inference request.
 */
struct ModelInferenceRequest {
    string modelId;
    ModelInput input;
}

/**
 * Model output, made up of tensors of either numbers or strings, ordered
 * as defined by the model.
 *
 * For example, if a model's output is: [number_tensor_1, string_tensor_1, number_tensor_2],
 * you could access them like this:
 *
 *   number_tensor_1 = output.numbers[0];
 *   string_tensor_1 = output.strings[0];
 *   number_tensor_2 = output.numbers[1];
 */
struct ModelOutput {
    NumberTensor[] numbers;
    StringTensor[] strings;
    bool is_simulation_result; // indicates whether the result is real
}
```

## Example Smart Contract

The following smart contract uses `NeuroML` to natively run a model from the Model Hub and persist its output.

```
import "opengradient-neuroml/src/OGInference.sol";

contract MlExample {
    // stores the latest inference result
    Number public resultNumber;

    // Execute an ML model from OpenGradient's model storage, secured by ZKML
    function runZkmlModel() public {
        // model takes 1 number tensor as input
        ModelInput memory modelInput = ModelInput(
            new NumberTensor[](1),
            new StringTensor[](0));

        // populate tensor
        Number[] memory numbers = new Number[](2);
        numbers[0] = Number(7286679744720459, 17); // 0.07286679744720459
        numbers[1] = Number(4486280083656311, 16); // 0.4486280083656311

        // set expected tensor name
        modelInput.numbers[0] = NumberTensor("input", numbers);

        // execute inference
        ModelOutput memory output = NEURO_ML.runModel(
            NeuroML.ModelInferenceMode.ZKML,
            ModelInferenceRequest(
                "QmbbzDwqSxZSgkz1EbsNHp2mb67rYeUYHYWJ4wECE24S7A",
                modelInput
            ));

        // handle result
        if (!output.is_simulation_result) {
            resultNumber = output.numbers[0].values[0];
        } else {
            resultNumber = Number(0, 0);
        }
    }
}
```