tiny ML Summit 2021 tiny Talks: Compute-in-Memory Hardware Accelerator for Always-On TinyML

tiny ML Summit 2021 https://www.tinyml.org/event/summit-2021
tiny Talks
Compute-in-Memory Hardware Accelerator for Always-On TinyML
Sameer WADHWA, Senior Director, Qualcomm

Always-on TinyML use cases rely on an efficient hardware accelerator to maximize battery runtime while enabling increasingly complex models.
The energy-efficiency limitations of von Neumann architectures when executing the memory-bandwidth-intensive compute presented by DNNs are well understood. It has also been shown that a large subset of DNN models can function with little or no accuracy degradation when activations and weights are quantized to 8 bits or even lower.
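To make the low-bit claim concrete, here is a minimal sketch (not the chip's actual quantization flow, which the abstract does not detail) of symmetric per-tensor quantization of a weight matrix to 8 bits, showing that the reconstruction error stays small relative to the weights themselves:

```python
import numpy as np

def quantize_symmetric(x, n_bits=8):
    """Symmetric per-tensor quantization to signed n_bits integers.

    A hypothetical helper for illustration; maps the largest
    magnitude in x to the largest representable integer.
    """
    qmax = 2 ** (n_bits - 1) - 1                      # 127 for 8-bit
    scale = np.max(np.abs(x)) / qmax                  # float step size
    q = np.clip(np.round(x / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)  # stand-in weights
q, s = quantize_symmetric(w, n_bits=8)
w_hat = dequantize(q, s)

# Relative quantization error at 8 bits is well under 1-2%,
# consistent with "little or no accuracy degradation".
rel_err = np.linalg.norm(w - w_hat) / np.linalg.norm(w)
print(f"relative error at 8 bits: {rel_err:.4f}")
```

At lower bit-widths (4, 2, 1 bit, as the chip supports) the same scheme usually needs quantization-aware training to hold accuracy, which is what the tool flow mentioned below provides.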
Compute-in-Memory (CIM) is an active research area in academia and industry. By keeping weights resident in the memory array, it reduces the memory-bandwidth requirements of executing DNN models, and by exploiting analog compute it improves the efficiency of MAC operations, together yielding a significant improvement in compute energy efficiency.
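A behavioral sketch of the CIM idea, under the usual crossbar assumptions (not a model of this particular chip): weights are programmed into the array once, an activation vector is applied to all rows simultaneously, and each column accumulates its multiply-sum in place, so no weight is fetched per MAC.

```python
import numpy as np

class CIMCrossbarModel:
    """Digital behavioral model of one CIM crossbar (illustrative only).

    The weights stay stationary in the array; mac() models the
    single analog step in which every column produces its dot-product.
    """

    def __init__(self, weights):
        # weights: (rows, cols) integer array, programmed once.
        self.w = np.asarray(weights, dtype=np.int32)

    def mac(self, activations):
        # All columns compute in parallel; digitally this is just
        # one matrix-vector product per applied activation vector.
        a = np.asarray(activations, dtype=np.int32)
        return a @ self.w                     # shape: (cols,)

# 4-bit weights and activations, one of the precisions the chip supports.
rng = np.random.default_rng(1)
w4 = rng.integers(-8, 8, size=(16, 4))        # signed 4-bit weights
a4 = rng.integers(0, 16, size=16)             # unsigned 4-bit activations
xbar = CIMCrossbarModel(w4)
print(xbar.mac(a4))                           # one MAC result per column
```

Running several such arrays side by side mirrors the multi-core parallelism described below for keyword-detection and computer-vision models.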
This work details a CIM-based stand-alone DNN hardware accelerator chip that is particularly well suited to executing always-on TinyML models. It supports convolution, fully-connected, pooling, ReLU, Sigmoid, and Tanh layers with 1/2/4/8-bit quantized activations and weights.
Multiple CIM cores on the chip can operate in parallel to support always-on voice keyword-detection and human-detection computer-vision models while consuming very low power.
The chip comes with a tool flow for quantizing, training, and compiling off-the-shelf models to map them efficiently onto the hardware. Both voice-UI and CV use cases are used to demonstrate the chip's low-power performance.

