Machine Learning

tinyML TALKS: From the Cloud to the Edge: The Future of Language Models with Mahesh Yadav of Google



It’s evident that Large Language Models (LLMs) have opened up new possibilities across many applications. However, the initial excitement has overshadowed critical issues such as high cost, latency, and safety and security concerns. As self-attention-based applications gain momentum, these overlooked concerns are driving a shift toward edge computing and Small Language Models (SLMs).

In this session, we’ll explore a future where LLMs function more like operating systems, with SLMs driving applications directly on devices. We’ll examine the techniques enabling this transition, from training methods like distillation and pruning to performance optimization for inference. Finally, we’ll bring it all together in a hands-on lab, running an SLM on the edge.
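To make the distillation technique mentioned above concrete, here is a minimal, illustrative sketch (not from the talk; the function names are my own) of the core idea: a small student model is trained to match a larger teacher's temperature-softened output distribution via a KL-divergence loss.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax: a higher temperature softens the
    # distribution, exposing the teacher's "dark knowledge" about
    # relative class similarities.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) on temperature-softened distributions --
    # the core term of knowledge distillation. In practice this is
    # combined with a standard cross-entropy loss on the true labels.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# A student that matches the teacher exactly incurs zero loss;
# any mismatch yields a positive loss to minimize.
teacher = [2.0, 1.0, 0.1]
student = [1.5, 1.2, 0.3]
print(distillation_loss(teacher, teacher))
print(distillation_loss(teacher, student))
```

Minimizing this loss over a training set transfers the teacher's behavior into the smaller student, which is one way SLMs suitable for on-device inference are produced.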

Mahesh has 20 years of experience building products on the AI teams at Meta, Microsoft, and AWS. He has worked across every layer of the AI stack, from AI chips to LLMs, and has a deep understanding of how GenAI companies deliver value to customers. His work on AI has been featured at the Nvidia GTC conference, Microsoft Build, and on Meta blogs.

