tinyML Talks recorded April 13, 2021
“Deploying AI to Embedded Systems”
Bernhard Suhm Ph.D.
Product Manager for Machine Learning
While most AI frameworks provide high-level languages and user interfaces to train models, preparing a trained model in a format that’s suitable for embedded deployment often requires recoding. Deploying industrial applications to embedded systems raises additional challenges including:
1. Integration of the AI model within a larger system
2. Meeting hardware constraints like limited memory and power consumption
3. Ensuring ongoing model performance, even if there are changes in the environment
This presentation describes an environment that supports interactive training of AI models, their preparation for embedded deployment, and their integration within industrial applications, all within a single framework. After prototyping the system using a high-level language as a single codebase, low-level deployable C/C++ or CUDA code is generated automatically. A system modeling and simulation environment with preconfigured blocks for many industrial applications facilitates integration of the AI model.
To fit larger AI models on hardware with limited memory and power, the models are compressed: converting machine learning models to fixed-point arithmetic reduces their footprint, while in deep learning, quantization is applied to the millions of parameters in deep neural networks.
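As a rough illustration of the quantization idea, the following minimal sketch maps floating-point weights to 8-bit integer codes with a single symmetric scale factor. It is not the method used by the environment described in the talk; the function names `quantize_int8` and `dequantize` and the example weights are assumptions made for this sketch.

```python
def quantize_int8(weights):
    """Map floats to integer codes in [-127, 127] using one symmetric scale.

    Hypothetical helper for illustration only.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from the integer codes."""
    return [c * scale for c in codes]


# Example weights (made up for this sketch).
weights = [0.5, -1.0, 0.25, 0.0]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

# Worst-case rounding error is about half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each code fits in one byte, compared with four bytes for a 32-bit float, at the cost of a rounding error of roughly half a quantization step per weight.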
Ongoing model performance could be ensured by retraining models from scratch with additional data, but that requires significant memory and computational power – too much for most embedded systems. Incremental learning instead adjusts model parameters continuously on streaming data and is thus computationally less demanding. When using code generation for deployment, model parameters need to be separated from prediction code, to avoid having to redeploy models with every update.
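The two ideas in this paragraph can be sketched together, under stated assumptions: a linear model trained by one stochastic-gradient step per streamed sample, with its parameters held in a plain data structure separate from the prediction function. The names `predict` and `update`, the learning rate, the synthetic stream, and the use of JSON for storing parameters are all assumptions of this sketch, not details from the talk.

```python
import json


def predict(params, x):
    """Fixed prediction code: linear model y = w . x + b."""
    return sum(w * xi for w, xi in zip(params["w"], x)) + params["b"]


def update(params, x, y, lr=0.1):
    """One incremental learning step (squared-error SGD) on a single sample."""
    err = predict(params, x) - y
    params["w"] = [w - lr * err * xi for w, xi in zip(params["w"], x)]
    params["b"] -= lr * err
    return params


# Parameters live in data, not in code (hypothetical layout).
params = {"w": [0.0], "b": 0.0}

# Synthetic stream drawn from y = 2x + 1, replayed several times.
stream = [([i / 10.0], 2.0 * (i / 10.0) + 1.0) for i in range(10)] * 300
for x, y in stream:
    update(params, x, y)

# Only this small blob would need to be shipped to the device on an update;
# the prediction function itself stays unchanged.
blob = json.dumps(params)
reloaded = json.loads(blob)
```

Because each step touches only one sample, memory use stays constant regardless of how much data has been seen, and a parameter update amounts to replacing the serialized blob rather than regenerating the prediction code.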