tinyML Summit 2021 tiny Talks: Low-precision Winograd Convolution over Residue Number System



tinyML Summit 2021 https://www.tinyml.org/event/summit-2021
tinyTalks Algorithms and Tools
“Low-precision Winograd Convolution over Residue Number System”
Zhi-Gang LIU, Research Engineer, Arm

Low-precision (8-bit or sub-8-bit) convolutional neural networks consume a fraction of the memory footprint and power compared to high-precision models running on mobile or embedded devices. The classical fast Winograd convolution algorithm requires high-precision floating-point operations and thus fails to accelerate low-precision CNNs. The current state-of-the-art low-precision convolution is therefore a GEMM-based approach, relying on im2col or im2row transformations to convert the convolution into a GEMM operation; each output demands 9 MAC operations for the popular 3×3 filter and 25 for a 5×5 filter. This work extends the Winograd algorithm to modular arithmetic and explores an optimized implementation of fast low-precision convolution for ultra-low-power machine learning (ML) at the edge. The new approach achieves an arithmetic reduction of up to 6.8× with 16×16 transformation tiles and relies only on int8 or int16 operations, which are well supported by commodity edge devices. We evaluated the performance of the proposal with sub-8-bit VGG16 and ResNet50v1 models on the ImageNet dataset, using an Arm Cortex-A53 CPU and a Cortex-M7 MCU, and observed more than 2× reduction in convolution latency.
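The core obstacle the abstract mentions is that standard Winograd transforms contain fractional constants (e.g. 1/2), which forces floating-point arithmetic. Over modular (residue) arithmetic, those fractions become exact modular inverses, so every step stays in integer arithmetic. Below is a minimal sketch of this idea for the small F(2,3) Winograd transform over a single modulus; the talk's actual method uses larger tiles and a residue number system with multiple coprime moduli, and the modulus 257 here is an illustrative assumption, not taken from the talk.

```python
# Sketch: Winograd F(2,3) convolution carried out entirely in modular
# integer arithmetic. The fractional 1/2 entries of the standard filter
# transform G become the modular inverse of 2, so no floats are needed.
# Modulus 257 is an illustrative choice (odd, so 2 is invertible).
M = 257

def winograd_f23_mod(d, g, m=M):
    """Two outputs of a 3-tap convolution from a 4-element input tile, mod m."""
    inv2 = pow(2, -1, m)  # modular inverse of 2 (requires Python 3.8+)
    # Input transform B^T d
    bt = [d[0] - d[2], d[1] + d[2], d[2] - d[1], d[1] - d[3]]
    # Filter transform G g, with the 1/2 factors replaced by inv2
    gg = [g[0],
          (g[0] + g[1] + g[2]) * inv2,
          (g[0] - g[1] + g[2]) * inv2,
          g[2]]
    # Elementwise products in Z_m
    p = [(bt[i] * gg[i]) % m for i in range(4)]
    # Output transform A^T p
    return [(p[0] + p[1] + p[2]) % m, (p[1] - p[2] - p[3]) % m]

def direct_conv(d, g):
    """Reference: direct 3-tap convolution (9 MACs would be needed in 2D)."""
    return [d[0]*g[0] + d[1]*g[1] + d[2]*g[2],
            d[1]*g[0] + d[2]*g[1] + d[3]*g[2]]

d = [3, -1, 4, 2]   # input tile
g = [2, 5, -3]      # filter
ref = direct_conv(d, g)
out = winograd_f23_mod(d, g)
assert out == [r % M for r in ref]  # modular Winograd agrees with direct conv mod M
```

The Winograd path uses 4 multiplications per two outputs versus 6 for the direct method in this 1D case; the larger tile sizes cited in the abstract (up to 16×16) are where the up-to-6.8× arithmetic reduction comes from. In an RNS implementation, the computation is repeated over several coprime moduli and the true result is reconstructed from the residues.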
