Exploring ML Compiler Optimizations with microTVM
Gavin UBERTI, Software Engineer, OctoML
Deep learning compilers can use optimization passes to make TinyML models run much faster. But what optimizations do they actually perform? In this talk, we'll use Apache TVM to compile a MobileNetV1 model for Cortex-M microcontrollers. We'll look inside its intermediate representations and watch how they change as optimizations are applied.
We’ll see how convolution kernels are tailored for the device, how quantization parameters are folded into subsequent operators, and how layouts are rewritten on the fly.
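The arithmetic behind two of these passes can be sketched in plain numpy. This is an illustrative model of what quantization folding and layout rewriting accomplish, with made-up values; it is not TVM's actual IR transformation code.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Quantization folding sketch ---
# A requantize step y = (x * s) @ W can be folded into the next
# operator's weights at compile time: y = x @ (s * W), removing a
# runtime multiply. Shapes and scale here are hypothetical.
x = rng.standard_normal((1, 4))   # activation
W = rng.standard_normal((4, 3))   # dense-layer weights
s = 0.05                          # per-tensor requantization scale

y_runtime = (x * s) @ W   # scale applied at runtime
W_folded = s * W          # scale folded into weights once, at compile time
y_folded = x @ W_folded

assert np.allclose(y_runtime, y_folded)

# --- Layout rewrite sketch ---
# Rewriting a tensor from NCHW to NHWC is a transpose of its axes;
# a compiler inserts these so each kernel sees its preferred layout.
t_nchw = rng.standard_normal((1, 3, 8, 8))
t_nhwc = t_nchw.transpose(0, 2, 3, 1)
assert t_nhwc.shape == (1, 8, 8, 3)
```

Because the folded weights are computed once ahead of time, the device never pays for the scale multiply, which matters on a Cortex-M core with no floating-point headroom to spare.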