tinyML Asia 2022 Shinkook Choi: Hardware-aware model optimization in Arm Ethos-U65 NPU



Hardware-aware model optimization in Arm Ethos-U65 NPU
Shinkook CHOI, Lead Core Research, Nota Inc

As deep learning advances, edge devices and lightweight neural networks are becoming increasingly important. To reduce latency on an AI accelerator, it is important not only to reduce FLOPs but also to increase hardware efficiency. By analyzing the hardware performance of the Arm Ethos-U65 NPU using Arm Virtual Hardware, we discovered that the latency of a convolution layer follows a staircase pattern that varies with the layer configuration. To fully utilize the Arm Ethos-U65 NPU, the parameters of the compression methods should be set with this latency pattern in mind. By applying the device characteristics of the Arm Ethos-U65 NPU to NetsPresso, a hardware-aware AI optimization platform, we adjusted the parameters of structured pruning and filter decomposition to align with multiples of the step size of the latency staircase. In an image classification task, we validated that hardware-aware model compression increased FLOPs and accuracy at the same latency.
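The idea of fitting compression parameters to the latency staircase can be illustrated with a small sketch. This is not NetsPresso's implementation or a measured Ethos-U65 profile: the step size of 16 channels, the toy latency model, and the function names `staircase_latency` and `align_channels` are all hypothetical, chosen only to show why rounding a pruned channel count up to a step boundary can add capacity without adding latency.

```python
# Toy illustration only: the step size and the latency model are assumptions,
# not measured Ethos-U65 behaviour.

STEP = 16  # hypothetical staircase step size, in output channels


def staircase_latency(channels: int, step: int = STEP, cost_per_step: float = 1.0) -> float:
    """Toy latency model: cost rises only when `channels` crosses a multiple of `step`."""
    steps_used = -(-channels // step)  # integer ceiling division
    return steps_used * cost_per_step


def align_channels(channels: int, step: int = STEP) -> int:
    """Round a channel count up to the next multiple of `step`."""
    if channels <= 0 or step <= 0:
        raise ValueError("channels and step must be positive")
    return -(-channels // step) * step


if __name__ == "__main__":
    pruned = 37  # channel count proposed by a latency-unaware pruning pass
    aligned = align_channels(pruned)
    print(f"pruned={pruned}, latency={staircase_latency(pruned)}")
    print(f"aligned={aligned}, latency={staircase_latency(aligned)}")
```

Under this toy model, the unaligned and aligned layers cost the same, so the aligned layer keeps more channels (and therefore more FLOPs) at equal latency. On real hardware the step size would have to be measured per layer configuration, for example by profiling.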

