Create a model deployment file

The first step in deploying your models is to create a YAML model deployment file.

A deployment file describes one model deployment case. Each file generates one static library; if more than one ABI is specified, one static library is generated for each ABI. A deployment file can contain one or more models. For example, a smart camera application may use face recognition, object recognition, and voice recognition models, all of which can be defined in one deployment file.

Example

Here is an example deployment file used by an Android demo application.

TODO: change this example file to the demo deployment file (reuse the same file) and rename to a reasonable name.

# The name of library
library_name: mobilenet
target_abis: [arm64-v8a]
embed_model_data: 1
# The build mode for model(s).
# 'code' converts the model(s) into C++ code; 'proto' converts the model(s) into protobuf file(s).
build_type: code
linkshared: 0
# One YAML config file can contain the configuration of multiple models.
models:
  mobilenet_v1: # model tag, which is used when loading the model and must be unique.
    platform: tensorflow
    # support local path, http:// and https://
    model_file_path: https://cnbj1.fds.api.xiaomi.com/mace/miai-models/mobilenet-v1/mobilenet-v1-1.0.pb
    model_sha256_checksum: 71b10f540ece33c49a7b51f5d4095fc9bd78ce46ebf0300487b2ee23d71294e6
    subgraphs:
      - input_tensors: input
        input_shapes: 1,224,224,3
        output_tensors: MobilenetV1/Predictions/Reshape_1
        output_shapes: 1,1001
    runtime: cpu+gpu
    limit_opencl_kernel_time: 0
    nnlib_graph_mode: 0
    obfuscate: 0
    winograd: 0
  mobilenet_v2:
    platform: tensorflow
    model_file_path: https://cnbj1.fds.api.xiaomi.com/mace/miai-models/mobilenet-v2/mobilenet-v2-1.0.pb
    model_sha256_checksum: 369f9a5f38f3c15b4311c1c84c032ce868da9f371b5f78c13d3ea3c537389bb4
    subgraphs:
      - input_tensors: input
        input_shapes: 1,224,224,3
        output_tensors: MobilenetV2/Predictions/Reshape_1
        output_shapes: 1,1001
    runtime: cpu+gpu
    limit_opencl_kernel_time: 0
    nnlib_graph_mode: 0
    obfuscate: 0
    winograd: 0
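
Each model entry in the example carries a model_sha256_checksum. For a local model file, that value can be computed with a few lines of Python. The sketch below uses only the standard hashlib module; the helper name sha256_of_file and the chunk size are illustrative choices, not part of the deployment tooling.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA256 digest of the file at `path`.

    The file is read in chunks so that large model files do not have
    to fit in memory at once. `chunk_size` (1 MiB here) is arbitrary.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling sha256_of_file on a local copy of the model returns a 64-character hexadecimal string that can be pasted into model_sha256_checksum (or weight_sha256_checksum for a Caffe weights file).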

Configurations

library_name	The name of the generated library.
target_abis	The target ABIs to build for; one or more of 'host', 'armeabi-v7a' or 'arm64-v8a'.
target_socs	[optional] Build only for the specified SoCs, if the model is meant to be used on those SoCs only.
embed_model_data	Whether to embed the model weights in the code; defaults to 0.
build_type	The model build type, one of ['proto', 'code']. 'proto' converts the model to a ProtoBuf file and 'code' converts the model to C++ code.
linkshared	[optional] Link the libmace library dynamically when set to 1, or statically when set to 0; defaults to 0.
model_name	The model name. Must be unique if there are multiple models. LIMIT: if build_type is code, model_name is used in the C++ code, so it must be a valid C++ identifier.
platform	The source framework, one of [tensorflow, caffe].
model_file_path	The path of the model file, can be local or remote.
model_sha256_checksum	The SHA256 checksum of the model file.
weight_file_path	[optional] The path of the model weights file, used by Caffe models.
weight_sha256_checksum	[optional] The SHA256 checksum of the weights file, used by Caffe models.
subgraphs	The subgraphs key. DO NOT EDIT.
input_tensors	The input tensor names (TensorFlow) or the top names of the input layers (Caffe); one or more strings.
output_tensors	The output tensor names (TensorFlow) or the top names of the output layers (Caffe); one or more strings.
input_shapes	The shapes of the input tensors, in NHWC order.
output_shapes	The shapes of the output tensors, in NHWC order.
input_ranges	The numerical range of the input tensors, default [-1, 1]. Used only for testing.
validation_inputs_data	[optional] The NumPy validation inputs. When not provided, random values in [-1, 1] are used.
runtime	The running device, one of [cpu, gpu, dsp, cpu+gpu]. cpu+gpu contains both the CPU and GPU model definitions, so the model can run on both the CPU and the GPU.
data_type	[optional] The data type used for the specified runtime: [fp16_fp32, fp32_fp32] for gpu (default fp16_fp32), [fp32] for cpu, [uint8] for dsp.
limit_opencl_kernel_time	[optional] Whether to split OpenCL kernels so that each runs within 1 ms, to keep the UI responsive; defaults to 0.
nnlib_graph_mode	[optional] Controls the DSP precision and performance; the default of 0 works for most cases.
obfuscate	[optional] Whether to obfuscate the model operator name, default to 0.
winograd	[optional] Whether to enable Winograd convolution, which increases memory consumption.
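
Several of the constraints listed above can be checked before a build is started. The following Python sketch takes a deployment file that has already been parsed into a dict (for example with a YAML parser such as PyYAML's yaml.safe_load) and reports violations of a few of those rules. The function name check_deployment is hypothetical, treating every key not marked [optional] as required is an assumption, and the identifier check is deliberately simple: it does not reject C++ reserved words.

```python
import re

# Allowed values, copied from the configuration table above.
ALLOWED_ABIS = {"host", "armeabi-v7a", "arm64-v8a"}
ALLOWED_BUILD_TYPES = {"proto", "code"}
ALLOWED_PLATFORMS = {"tensorflow", "caffe"}

# A plain C/C++ identifier: letter or underscore, then letters, digits, underscores.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def check_deployment(config):
    """Return a list of problems found in a parsed deployment dict (hypothetical helper)."""
    problems = []
    if not config.get("library_name"):
        problems.append("library_name is missing")

    abis = config.get("target_abis") or []
    if not abis:
        problems.append("target_abis is empty")
    for abi in abis:
        if abi not in ALLOWED_ABIS:
            problems.append(f"unknown ABI: {abi}")

    build_type = config.get("build_type")
    if build_type not in ALLOWED_BUILD_TYPES:
        problems.append(f"unknown build_type: {build_type}")

    for name, model in (config.get("models") or {}).items():
        # With build_type 'code' the model name ends up in C++ code.
        if build_type == "code" and not IDENTIFIER.fullmatch(name):
            problems.append(f"model name is not a C++ identifier: {name}")
        if model.get("platform") not in ALLOWED_PLATFORMS:
            problems.append(f"{name}: unknown platform: {model.get('platform')}")
        for key in ("model_file_path", "model_sha256_checksum"):
            if not model.get(key):
                problems.append(f"{name}: {key} is missing")
    return problems


# A trimmed-down version of the mobilenet_v1 entry from the example file.
example = {
    "library_name": "mobilenet",
    "target_abis": ["arm64-v8a"],
    "build_type": "code",
    "models": {
        "mobilenet_v1": {
            "platform": "tensorflow",
            "model_file_path": "https://cnbj1.fds.api.xiaomi.com/mace/miai-models/mobilenet-v1/mobilenet-v1-1.0.pb",
            "model_sha256_checksum": "71b10f540ece33c49a7b51f5d4095fc9bd78ce46ebf0300487b2ee23d71294e6",
        }
    },
}
print(check_deployment(example))  # prints []
```

A model tag such as mobilenet-v1 would be flagged when build_type is code, because the hyphen is not allowed in a C++ identifier.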