The rapid evolution of artificial intelligence (AI) is transforming edge computing, and Sharad Chole, Co-founder and Chief Scientist at Expedera, discusses the implications. Expedera, a neural network IP provider, focuses on neural processing units (NPUs) for edge devices, emphasizing low-power operation, bandwidth optimization, and cost efficiency. In our latest episode of Ask the Experts, Sharad shared his insights on the challenges and opportunities of deploying AI inference workloads at the edge.
The Exponential Growth in AI Model Complexity
Sharad began by noting the exponential growth in AI model sizes, from hundreds of millions to billions and now trillions of parameters. This explosive increase poses significant challenges, especially when deploying these complex models on edge devices with limited resources.
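To make the scale of that challenge concrete, here is a rough back-of-the-envelope sketch (the parameter counts and precisions below are illustrative assumptions, not figures from the interview): weight storage alone grows linearly with parameter count, so trillion-parameter models quickly outgrow the memory available on edge devices.

```python
# Illustrative sizing: weight storage is roughly
# (parameter count) x (bytes per parameter).
# Counts and precisions are hypothetical examples.

BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}

def weight_footprint_gb(num_params: float, precision: str) -> float:
    """Approximate weight storage in GB (ignores activations, KV cache, etc.)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9

for name, params in [("100M-param model", 100e6),
                     ("7B-param model", 7e9),
                     ("1T-param model", 1e12)]:
    print(f"{name}: "
          f"{weight_footprint_gb(params, 'fp16'):.1f} GB at FP16, "
          f"{weight_footprint_gb(params, 'int8'):.1f} GB at INT8")
```

Even at reduced precision, a trillion-parameter model needs on the order of a terabyte just for its weights, which is why model size growth dominates the edge deployment conversation.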
Overcoming Challenges in Edge AI: Memory and Bandwidth
Memory and bandwidth management emerged as central themes in Sharad's discussion. For edge devices to perform AI inference efficiently, they need advanced memory management techniques that handle data movement without overwhelming system resources. Sharad emphasized the role of quantization techniques, which shrink the memory footprint and computational load of AI models, making them more suitable for edge deployment. He also categorized AI applications into human task replacement, supervised agents, and tools, noting that the industry is increasingly focused on supervised agents and tools for practical deployment.
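As a minimal sketch of the kind of quantization Sharad referenced, the example below uses symmetric per-tensor INT8 quantization, an assumption for illustration; the interview does not describe Expedera's specific techniques.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor INT8 quantization: map float weights to [-127, 127]."""
    scale = np.max(np.abs(weights)) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) * scale

# Example: a hypothetical weight tensor shrinks 4x (FP32 -> INT8)
w = np.random.randn(1024, 1024).astype(np.float32)
q, scale = quantize_int8(w)
error = np.mean(np.abs(w - dequantize(q, scale)))
print(f"FP32 bytes: {w.nbytes}, INT8 bytes: {q.nbytes}, mean abs error: {error:.4f}")
```

The 4x reduction in weight bytes translates directly into lower storage and memory bandwidth requirements, at the cost of a small, controllable approximation error.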
The Road Ahead for AI at the Edge
Sharad concluded by outlining the critical challenges that lie ahead for AI hardware, particularly the need for efficient memory and bandwidth management for both training and inference. As AI continues to grow in complexity, so too will the demands on hardware.
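To illustrate why bandwidth is singled out, here is a hedged estimate under a simplifying assumption (that every weight is streamed from memory once per generated token; the model size and token rate are hypothetical):

```python
def inference_bandwidth_gbps(num_params: float, bytes_per_param: float,
                             tokens_per_second: float) -> float:
    """Rough lower bound on memory bandwidth if all weights are read once per token
    (ignores caching, batching, and activation/KV-cache traffic)."""
    return num_params * bytes_per_param * tokens_per_second / 1e9

# Hypothetical edge scenario: a 3B-parameter model quantized to INT8 at 10 tokens/s
print(f"{inference_bandwidth_gbps(3e9, 1, 10):.0f} GB/s")  # about 30 GB/s
```

Even this modest scenario consumes a large share of a typical edge device's DRAM bandwidth, which is why on-chip memory management and bandwidth optimization are central to NPU design.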
For those interested in learning more about Expedera's work in advancing edge AI technology, Sharad invites readers to visit Expedera's website and connect with him on LinkedIn.
Key Quote
One thing to point out here is why we are going towards larger models. And this is very interesting to me, it's because of scaling laws. So, there is a point at which the models start exhibiting interesting capabilities - it's not just predicting the next token based on what is similar in the corpus, but it's actually understanding your question and context, and you can ask it more complex questions.