12/03/2024 | Press release | Distributed by Public on 12/03/2024 10:50
Welcome to the second webinar in our "Future of Technology" series, in which Zack Zheng, Imagination's Director of Product Management in China, explains the progress China has made in developing foundational models and deploying them at the edge, highlighting key trends, technological advancements, and the future outlook. You can watch the webinar recording by completing this form, or read our recap below for the key takeaways.
China's investment in Gen-AI is projected to surge, with an estimated 86% CAGR over the next five years. This growth is driven by a push for technological self-sufficiency, from applications to chips, and a strong emphasis on locally developed technology. The key areas of development are outlined below.
Chinese labs have contributed widely to the AI landscape, particularly to open-source AI ecosystems. Notable examples are described below.
China's foundational models are benchmarking well, with models like Qwen2.5-72B-instruct and GLM-4-plus showing significant improvements in instruction following, long text generation, and structured data understanding. The latest Qwen 2.5 was trained on up to 18 trillion tokens, and these models have been shown to be resilient to diverse system prompts, enhancing their utility in a variety of applications.
Tencent's Hunyuan-Large, with 389 billion total parameters, 52 billion activated parameters, and a context window of up to 256,000 tokens, stands out as the largest open-source transformer-based mixture-of-experts model. It performs well in benchmarks for language understanding, logical reasoning, and more, outperforming many larger models.
China has an active open-source LLM community, with models like DeepSeek-V2, a mixture-of-experts language model with 236 billion total parameters that is well regarded for economical training and efficient inference. These models offer strong general conversational capabilities, robust code processing, and better alignment with human preferences.
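The gap between a mixture-of-experts model's total parameter count (e.g. Hunyuan-Large's 389 billion) and its much smaller activated count (52 billion) comes from routing: each token is sent to only a few experts, so most weights sit idle on any single forward pass. The toy sketch below illustrates the idea; the layer sizes, expert count and top-k value are illustrative assumptions, not the configuration of any model mentioned above.

```python
import numpy as np

# Toy mixture-of-experts layer (illustrative sizes, not any real model):
# a router scores every expert per token, but only the top-k experts run,
# so "activated" parameters are a small fraction of total parameters.
rng = np.random.default_rng(0)

n_experts = 16        # experts held in the layer
top_k = 2             # experts actually evaluated per token
d_model, d_ff = 64, 256

# Each expert is a small two-matrix feed-forward block.
experts = [(rng.standard_normal((d_model, d_ff)) * 0.02,
            rng.standard_normal((d_ff, d_model)) * 0.02)
           for _ in range(n_experts)]
router_w = rng.standard_normal((d_model, n_experts)) * 0.02

def moe_layer(x):
    """Route one token vector x through its top-k experts only."""
    scores = x @ router_w                  # one routing score per expert
    top = np.argsort(scores)[-top_k:]      # indices of the k best experts
    gates = np.exp(scores[top])
    gates /= gates.sum()                   # softmax over the chosen experts
    out = np.zeros_like(x)
    for g, i in zip(gates, top):
        w1, w2 = experts[i]
        out += g * (np.maximum(x @ w1, 0) @ w2)  # gated expert output
    return out

token = rng.standard_normal(d_model)
y = moe_layer(token)

total_params = n_experts * 2 * d_model * d_ff
active_params = top_k * 2 * d_model * d_ff
print(total_params, active_params)  # active is top_k/n_experts of the total
```

With 2 of 16 experts running, only an eighth of the expert weights are touched per token, which is the same mechanism that lets a very large MoE model keep per-token compute close to that of a much smaller dense model.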
Several influential models below 10 billion parameters are emerging, such as GLM-4-9B-Chat and MiniCPM-2B, which perform well in Chinese-language tasks and other applications. Combined with the continuing maturation of compute hardware and software for edge devices, this enables the integration of these models into cars, robots, wearables and more. Taking smartphones as a specific edge example, many OEMs are integrating advanced AI models, some up to 7 billion parameters in size, into their latest devices to enable applications like image generation, text understanding and AI agents.
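A quick back-of-envelope calculation shows why 7 billion parameters is roughly the ceiling for today's smartphones: weight storage alone scales with parameter count times bits per weight. The sketch below uses illustrative assumptions (weights only, ignoring activations and KV cache) rather than any vendor's published figures.

```python
# Back-of-envelope weight-memory estimate for a 7B-parameter model
# (illustrative assumptions only: weights, no activations or KV cache).
PARAMS = 7e9
GIB = 1024 ** 3

def weight_memory_gib(params, bits_per_weight):
    """Approximate weight storage in GiB at a given precision."""
    return params * bits_per_weight / 8 / GIB

for name, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
    print(f"{name}: {weight_memory_gib(PARAMS, bits):.1f} GiB")
# At fp16 the weights alone need roughly 13 GiB, which is why 4-bit
# quantisation is common before a 7B model fits alongside a phone OS.
```

The same arithmetic explains the appeal of the sub-10B models above: a 2B-parameter model at 4 bits needs only about 1 GiB of weight memory, leaving headroom for the rest of the device's workload.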
Why is all this effort going into creating smaller models that run on edge devices? Deploying Gen-AI at the edge rather than via the cloud enables faster response times, greater data privacy, personalised user experiences and lower cloud inference costs. Yet with many AI models still being compute-intensive, fully realising the opportunity of AI at the edge is an ongoing engineering effort.
To achieve success, edge devices need to balance computational and memory constraints, power and energy limitations, and the heterogeneous nature of edge computing hardware. In particular, AI at the edge needs to coexist with many other critical user-facing and device-management tasks without exceeding the thermal capacity of the device. GPUs offer the programmability, flexibility, efficiency and compute performance needed to bring Gen-AI applications to the edge.
The technology ecosystem in China is rapidly developing successful foundational models and deploying them at the edge. With continuous innovation and a strong focus on localisation and customisation, consumers will soon see even more powerful AI capabilities coming to their devices.
Tune in to the next "Future of Technology" webinar, Trends in Automotive, presented by Rob Fisher, Senior Director of Product Management at Imagination, live on Thursday 6 November and available on-demand thereafter.
Visit our AI pages to find out more about Imagination's solutions for Generative AI.