HAL ML Accelerator Backends

This document provides an overview of the available Machine Learning (ML) accelerator backends within this project. It details their purpose, configuration, and specific custom properties that can be used to tailor their behavior.

1. Vivante Backend (ml-vivante)

  • Vendor: VeriSilicon
  • Description: This backend utilizes VeriSilicon's Neural Processing Unit (NPU) for ML inference. It requires a model file (typically .nb) together with either a JSON file or a shared object (.so) file generated by Vivante's toolchain; these files contain the compiled network and the runtime functions it needs.
  • Source File: src/hal-backend-ml-vivante.cc

Configuration

There are two options:

  1. model.nb + model.json
  2. model.nb + model.so
  • Model Files (prop->model_files):
    • model_files[0]: Path to the Vivante model file (e.g., my_model.nb). This file contains the neural network graph and weights.
    • (Optional) model_files[1]: Path to the Vivante shared object file (e.g., vnn_my_model.so). This library provides functions like vnn_CreateNeuralNetwork and vnn_ReleaseNeuralNetwork.

Note that if the shared object file is not provided, the backend attempts to load the model using the JSON file; in that case the JSON path must be specified via custom_properties (see the json property below).
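
The following sketch (illustrative only, not the backend's actual code; all paths are hypothetical) shows how the two configurations map onto the properties the backend receives:

    /* Option 1: model.nb + model.json
     * The JSON path travels through custom_properties, not model_files. */
    const char *files_opt1[] = { "/path/to/my_model.nb" };
    const char *custom_opt1 = "json:/path/to/my_model.json";

    /* Option 2: model.nb + model.so */
    const char *files_opt2[] = {
      "/path/to/my_model.nb",     /* neural network graph and weights */
      "/path/to/vnn_my_model.so", /* vnn_CreateNeuralNetwork, vnn_ReleaseNeuralNetwork, ... */
    };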

Custom Properties (prop->custom_properties)

  • json:
    • Description: Path to the JSON file for the given model file.
    • Key: json
    • Value: Path to the JSON file.
    • Example: json:/path/to/my_model.json
    // yolo-v8m.json
    {
      "input_tensors": [
        {
          "id": 2,
          "dim_num": 4,
          "size": [416, 416, 3, 1],
          "dtype": {
            "vx_type": "VSI_NN_TYPE_UINT8",
            "qnt_type": "VSI_NN_QNT_TYPE_AFFINE_ASYMMETRIC",
            "zero_point": 0,
            "scale": 0.00390625
          }
        }
      ],
      "output_tensors": [
        {
          "id": 0,
          "size": [3549, 4, 1],
          "dim_num": 3,
          "dtype": {
            "scale": 2.325678825378418,
            "zero_point": 12,
            "qnt_type": "VSI_NN_QNT_TYPE_AFFINE_ASYMMETRIC",
            "vx_type": "VSI_NN_TYPE_UINT8"
          }
        },
        {
          "id": 1,
          "dim_num": 3,
          "size": [3549, 80, 1],
          "dtype": {
            "scale": 0.0038632985670119524,
            "zero_point": 0,
            "qnt_type": "VSI_NN_QNT_TYPE_AFFINE_ASYMMETRIC",
            "vx_type": "VSI_NN_TYPE_UINT8"
          }
        }
      ]
    }
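
The scale/zero_point fields above follow the standard affine-asymmetric quantization scheme, so a raw output byte can be decoded as in this minimal sketch (the helper function is ours for illustration, not part of the backend):

    #include <stdint.h>

    /* For VSI_NN_QNT_TYPE_AFFINE_ASYMMETRIC, a real value is recovered from
     * a quantized uint8 value q as: real = scale * (q - zero_point). */
    static float
    dequantize_affine_asymmetric (uint8_t q, float scale, int zero_point)
    {
      return scale * ((int) q - zero_point);
    }

    /* e.g., for the first output tensor above (scale ~2.3257, zero_point 12),
     * a raw byte of 20 decodes to roughly 2.3257 * (20 - 12) = 18.6. */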

2. SNPE Backend (ml-snpe)

  • Vendor: Qualcomm
  • Description: This backend leverages Qualcomm's Snapdragon Neural Processing Engine (SNPE) for executing ML models. It typically uses a .dlc (Deep Learning Container) file as its model format.
  • Source File: src/hal-backend-ml-snpe.cc

Configuration

  • Model Files (prop->model_files):
    • model_files[0]: Path to the SNPE model file (e.g., my_model.dlc).
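
For illustration (hypothetical path), a single-entry model_files array is all this backend expects:

    /* Sketch: the SNPE backend takes exactly one model file, the .dlc. */
    const char *model_files[] = { "/path/to/my_model.dlc" };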

Custom Properties (prop->custom_properties)

Custom properties for the SNPE backend are provided as a single string, with individual key:value pairs separated by commas. Format: key1:value1,key2:value2,key3:value3a;value3b

  • Runtime:

    • Description: Specifies the preferred SNPE runtime target for model execution.
    • Key: Runtime
    • Value:
      • CPU: Use CPU runtime.
      • GPU: Use GPU runtime.
      • DSP: Use DSP runtime.
      • NPU or AIP: Use NPU/AIP runtime (specifically maps to SNPE_RUNTIME_AIP_FIXED8_TF).
    • Example: Runtime:DSP
  • OutputTensor:

    • Description: Specifies the names of the output tensors the application wishes to retrieve. If not provided, the backend uses all default output tensors defined in the model.
    • Key: OutputTensor
    • Value: A semicolon-separated list of output tensor names. Tensor names themselves can include colons.
    • Example: OutputTensor:detection_scores:0;detection_classes:0;raw_outputs/box_encodings
  • OutputType:

    • Description: Specifies the desired data types for the output tensors. The order of types in the list should correspond to the order of output tensors (either the default order or the order specified by the OutputTensor property).
    • Key: OutputType
    • Value: A semicolon-separated list of data types.
      • FLOAT32: Output tensor data type will be 32-bit float.
      • TF8: Output tensor data type will be 8-bit quantized (typically uint8_t).
    • Example: OutputType:FLOAT32;TF8 (for a model with two output tensors: the first returned as FLOAT32, the second as TF8)
  • InputType:

    • Description: Specifies the data types for the input tensors. The order of types in the list should correspond to the order of input tensors as defined in the model.
    • Key: InputType
    • Value: A semicolon-separated list of data types.
      • FLOAT32: Input tensor data type is 32-bit float.
      • TF8: Input tensor data type is 8-bit quantized (typically uint8_t; efficient for raw RGB data). Using TF8 usually implies that the model has built-in quantization parameters, and the backend will expect quantized input data. If the model expects float input but TF8 is specified, errors are likely.
    • Example: InputType:TF8 (assuming a single input tensor of TF8 type)

Example custom_properties String for SNPE:

"Runtime:DSP,OutputTensor:my_output_tensor1;my_output_tensor2,OutputType:FLOAT32;FLOAT32,InputType:TF8"
