tinyML Summit 2021 tiny Talks: Compute-in-Memory Hardware Accelerator for Always-On TinyML

tinyML Summit 2021 https://www.tinyml.org/event/summit-2021
tiny Talks
Compute-in-Memory Hardware Accelerator for Always-On TinyML
Sameer WADHWA, Senior Director, Qualcomm

Always-on TinyML use cases rely on an efficient hardware accelerator to maximize battery run time while enabling increasingly complex models.
The energy-efficiency limitations of von Neumann architectures when executing the memory-bandwidth-intensive workloads presented by deep neural networks (DNNs) are well understood. It has also been shown that a large subset of DNN models can run with little or no accuracy degradation when activations and weights are quantized to 8 bits or fewer.
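As a rough illustration of what reduced-precision weights look like, the following is a minimal sketch of uniform symmetric quantization at several bit widths. It is not taken from the talk: the function names, the example weights, and the sign-based 1-bit scheme are all assumptions chosen for illustration.

```python
def quantize(values, bits):
    """Map real values to signed integer codes plus one shared scale factor."""
    if bits < 1 or not values:
        raise ValueError("need bits >= 1 and a non-empty list of values")
    if bits == 1:
        # Assumed 1-bit scheme: keep only the sign, scale by the mean magnitude.
        scale = sum(abs(v) for v in values) / len(values)
        return [1 if v >= 0 else -1 for v in values], scale
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits, 7 for 4 bits, 1 for 2 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest code and clamp to the symmetric range [-qmax, qmax].
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate real values from integer codes."""
    return [c * scale for c in codes]


def max_abs_error(values, bits):
    """Largest absolute difference between a value and its quantized version."""
    codes, scale = quantize(values, bits)
    restored = dequantize(codes, scale)
    return max(abs(v - r) for v, r in zip(values, restored))


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]  # hypothetical example weights
    for bits in (8, 4, 2, 1):
        codes, _ = quantize(weights, bits)
        print(bits, codes, max_abs_error(weights, bits))
```

The rounding error grows as the bit width shrinks, which is why low-bit models are typically trained or fine-tuned with quantization in the loop rather than quantized after the fact.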
Compute-in-memory (CIM) is an active research area in academia and industry. It aims to significantly improve compute energy efficiency by reducing the memory bandwidth required to execute DNN models and by using analog computation to make multiply-accumulate (MAC) operations more efficient.
This work details a CIM-based stand-alone DNN hardware accelerator chip that is particularly well suited to executing always-on TinyML models. It supports convolution, fully connected, pooling, ReLU, sigmoid, and tanh layers with 1/2/4/8-bit quantized activations and weights.
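To make the arithmetic concrete, here is a minimal software sketch of a fully connected layer followed by ReLU, computed with integer multiply-accumulates on quantized activations and weights. It only models the numerical result; it does not describe how the chip's analog CIM array performs the computation, and the function names, scale factors, and example values are hypothetical.

```python
def fully_connected_int(x_codes, w_codes, x_scale, w_scale, bias=None):
    """Compute y = W x with integer MACs, then rescale to real values.

    x_codes: integer activation codes (length n)
    w_codes: integer weight codes, one row of length n per output
    x_scale, w_scale: real scale factors from quantization
    """
    outputs = []
    for row_index, row in enumerate(w_codes):
        if len(row) != len(x_codes):
            raise ValueError("weight row length must match input length")
        acc = 0
        for w, x in zip(row, x_codes):
            acc += w * x  # integer multiply-accumulate
        y = acc * x_scale * w_scale  # one rescale per output, not per MAC
        if bias is not None:
            y += bias[row_index]
        outputs.append(y)
    return outputs


def relu(values):
    """Clamp negative values to zero."""
    return [max(0.0, v) for v in values]


if __name__ == "__main__":
    x_codes = [1, 2, 3]                  # hypothetical quantized activations
    w_codes = [[1, 0, -1], [2, 1, 0]]    # hypothetical quantized weights
    y = fully_connected_int(x_codes, w_codes, x_scale=0.5, w_scale=0.25)
    print(relu(y))
```

Keeping the accumulation in integers and applying the scale factors once per output is what lets low-bit hardware do the bulk of the work with narrow operands.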
Multiple CIM cores on the chip can operate in parallel to support always-on voice keyword detection and human-detection computer-vision (CV) models while consuming very low power.
The chip comes with a tool flow that supports quantizing, training, and compiling off-the-shelf models so that they map efficiently onto the hardware. Both voice UI and CV use cases are used to demonstrate the chip’s low-power performance.

