EMEA 2021 Keynote: The model efficiency pipeline, enabling deep learning inference at the edge



EMEA 2021 https://www.tinyml.org/event/emea-2021
Keynote
Bert MOONS, Research Scientist, Qualcomm

Today, most deep learning and AI applications are developed on and for high-performance computing systems in the cloud. To make them suitable for real-time deployment on low-power edge devices and wearable platforms, they must be specifically optimized. This talk gives an overview of a model-efficiency pipeline that achieves this goal: automatically optimizing deep learning applications through Hardware-Aware Neural Architecture Search, compressing and pruning redundant layers, and subsequently converting them to low-bitwidth integer representations with state-of-the-art data-free and training-based quantization tools. Finally, we take a sneak peek at what’s next in efficient deep learning at the edge: mixed-precision hardware-aware neural architecture search and conditional processing.
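To make the quantization step concrete: converting weights to low-bitwidth integers amounts to choosing a scale factor that maps the float range onto the integer grid. Below is a minimal, hypothetical sketch of per-tensor symmetric int8 quantization in NumPy; it illustrates the general idea only and is not Qualcomm's tooling (the function names `quantize_symmetric` and `dequantize` are invented for this example).

```python
import numpy as np

def quantize_symmetric(weights: np.ndarray, num_bits: int = 8):
    """Map float weights to signed integers with a single per-tensor scale.

    This is a data-free scheme: the scale is derived from the weights alone,
    with no calibration data required.
    """
    qmax = 2 ** (num_bits - 1) - 1            # e.g. 127 for int8
    scale = np.max(np.abs(weights)) / qmax    # per-tensor scale factor
    q = np.clip(np.round(weights / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float values from the integer representation."""
    return q.astype(np.float32) * scale

w = np.array([-1.0, -0.5, 0.0, 0.25, 1.0], dtype=np.float32)
q, s = quantize_symmetric(w)
w_hat = dequantize(q, s)
# per-element quantization error is bounded by roughly scale / 2
```

Real pipelines go further than this sketch: per-channel scales, asymmetric ranges for activations, and training-based (quantization-aware) fine-tuning to recover accuracy lost at very low bitwidths.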
