Kneron NPUs require models in NEF (NPU Executable Format), a binary format. This guide explains how to convert a standard ONNX model into NEF.
The Conversion Pipeline
ONNX Model → Optimize → Quantize → Compile → NEF File
1. Prepare Your ONNX Model
Ensure your model uses only supported ONNX operators.
Warning: Dynamic axes are not supported. Please export your model with fixed input dimensions (e.g., 1x3x224x224).
PyTorch Export Example
import torch

# `model` is your trained torch.nn.Module; put it in eval mode before export
dummy_input = torch.randn(1, 3, 224, 224)  # fixed NCHW shape, no dynamic axes
torch.onnx.export(model, dummy_input, "model.onnx",
                  opset_version=11,
                  input_names=['input'],
                  output_names=['output'])

2. Run the Optimizer
The onnx2onnx tool simplifies the graph and folds constants.
kneron optimize input_model.onnx -o optimized.onnx

3. Quantization (Calibration)
Kneron NPUs run inference in INT8 precision, so the FP32 model must be quantized. Supply a calibration dataset of roughly 100 representative images so the toolchain can determine the quantization ranges.
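One way to assemble the calibration folder is to randomly sample ~100 representative images from your training set. A minimal stdlib sketch; the directory names and the `build_calibration_set` helper are illustrative, not part of the Kneron toolchain (the demo runs on a throwaway directory of stub files):

```python
import random
import shutil
import tempfile
from pathlib import Path

def build_calibration_set(src: Path, dst: Path, n: int = 100, seed: int = 0) -> int:
    """Copy a random sample of up to n images from src into dst; return the count."""
    images = sorted(p for p in src.iterdir()
                    if p.suffix.lower() in {".jpg", ".jpeg", ".png"})
    picked = random.Random(seed).sample(images, min(n, len(images)))
    dst.mkdir(parents=True, exist_ok=True)
    for p in picked:
        shutil.copy2(p, dst / p.name)
    return len(picked)

# Demo on a throwaway directory of stub files; in practice, point src at your
# real dataset and dst at ./calibration_images for the quantize step below.
tmp = Path(tempfile.mkdtemp())
src = tmp / "train"
src.mkdir()
for i in range(250):
    (src / f"img_{i:03d}.jpg").write_bytes(b"\xff\xd8")  # fake JPEG stub
print(build_calibration_set(src, tmp / "calibration_images"))  # 100
```

A fixed seed keeps the sample reproducible, so reruns of the pipeline calibrate against the same images.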
kneron quantize optimized.onnx --dataset ./calibration_images --out quantized.onnx

4. Compile to NEF
Finally, compile the quantized model for your specific target device (e.g., KL720).
kneron compile quantized.onnx --target KL720 --out model.nef
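The three toolchain invocations above can be chained into a single script so a model rebuild is one command. A sketch using only the commands shown in this guide, with the example file names from each step:

```shell
#!/bin/sh
set -e  # stop at the first failing stage

kneron optimize input_model.onnx -o optimized.onnx
kneron quantize optimized.onnx --dataset ./calibration_images --out quantized.onnx
kneron compile quantized.onnx --target KL720 --out model.nef

echo "Wrote model.nef for KL720"
```

With `set -e`, a failure in any stage (for example, an unsupported operator caught by the optimizer) aborts the run instead of compiling a stale intermediate file.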