Imagination Technologies Group Ltd.

12/03/2024 | Press release | Distributed by Public on 12/03/2024 10:50

The Future of Technology: Generative AI in China

Welcome to the second in our "Future of Technology" series, where Zack Zheng, Imagination's Director of Product Management in China, explains the progress China has made in developing foundational models and deploying them at the edge, highlighting key trends, technological advancements, and the future outlook. You can watch the webinar recording by completing this form, or read our recap below for the key takeaways.

The Rise of Gen-AI in China

China's investment in Gen-AI is projected to surge with an estimated 86% CAGR over the next five years. This growth is driven by a focus on technological self-sufficiency, from applications to chips, and a strong emphasis on locally developed technology. Key areas of development include:

  • AI Chip Development: Addressing the need for powerful AI infrastructure.
  • Dataset Localisation: Supporting local languages and dialects.
  • Customised AI Applications: Tailored to specific consumer and industry needs.
  • Open-Source Community: Active participation in deep learning frameworks and large model open-source communities.

Regional Technical Trends

Chinese labs have contributed widely to the AI landscape, particularly in open-source AI ecosystems. These contributions include:

  • Leaderboard Performance: Chinese models are competitive with top-tier global models, excelling in specific subtasks.
  • Multimodal Large Language Models: Strong contenders in sub-domains such as Chinese language understanding.
  • On-Device LLMs: Progress in developing efficient models for edge devices, such as MiniCPM3-4B.

Foundational Benchmarks

China's foundational models are benchmarking well, with models like Qwen2.5-72B-instruct and GLM-4-plus showing significant improvements in instruction-following, long text generation, and structured data understanding. The latest Qwen 2.5 was trained on up to 18 trillion tokens, and these models have been shown to be resilient to diverse system prompts, enhancing their utility in various applications.

Multimodal Large Language Models (LLMs)

Tencent's Hunyuan-Large, with 389 billion total parameters, 52 billion activated parameters, and a context window of up to 256,000 tokens, stands out as the largest open-source transformer-based mixture-of-experts model. It performs well in benchmarks for language understanding, logical reasoning, and more, outperforming many larger models.

Open-Source Contributions

China has an active open-source LLM community, with models like DeepSeek-V2, a mixture-of-experts language model with 236 billion total parameters, well-regarded for economical training and efficient inference. These models support general conversational capabilities, robust code processing, and better alignment with human preferences.

On-Device Language Models

Several influential models below 10 billion parameters are emerging, such as GLM-4-9B-Chat and MiniCPM-2B, which perform well in Chinese-language tasks and other applications. Combined with the continuous maturation of compute hardware and software for edge devices, this enables the integration of these models into cars, robots, wearables and more. Taking smartphones as a specific edge example, many OEMs are integrating advanced AI models, some ranging up to 7 billion parameters in size, into their latest devices to enable advanced applications like image generation, text understanding and AI agents.

Challenges and Solutions in Edge Computing

Why is all this effort going into creating smaller models that run on edge devices? Deploying Gen-AI at the edge rather than via the cloud enables faster response times, greater data privacy, personalised user experiences and lower cloud inference costs. Yet because many AI models remain compute-intensive, fully realising the opportunity of AI at the edge is an ongoing engineering effort.

To achieve success, edge devices need to balance computational and memory constraints, power and energy limitations, and the heterogeneous nature of edge computing devices. In particular, AI at the edge needs to coexist with many other critical user-facing and device management tasks without exceeding the thermal capacity of the device. GPUs offer the programmability, flexibility, efficiency and compute performance to bring Gen-AI applications to the edge.
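To see why memory constraints push developers toward smaller, quantised models, a back-of-the-envelope calculation helps. The sketch below (our own illustration, not from the webinar; the function name and precision choices are assumptions) estimates the memory needed just to hold a model's weights at different numeric precisions:

```python
# Rough memory-footprint estimate for holding an LLM's weights in memory.
# Illustrative arithmetic only: real deployments also need memory for
# activations, the KV cache, and the runtime itself.

def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate decimal gigabytes needed to store the weights alone."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9

# A 7-billion-parameter model, as mentioned for smartphone deployments:
for bits in (16, 8, 4):
    print(f"7B model @ {bits}-bit weights: ~{weight_memory_gb(7, bits):.1f} GB")
# 16-bit: ~14.0 GB, 8-bit: ~7.0 GB, 4-bit: ~3.5 GB
```

At 16-bit precision a 7B model would not fit alongside the OS and other apps on a typical phone, which is why 4-bit and 8-bit quantisation, together with sub-10B architectures like MiniCPM, is central to edge deployment.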

China's Edge in Innovation and Deployment

The technology ecosystem in China is rapidly developing successful foundational models and deploying them at the edge. With continuous innovation and a strong focus on localisation and customisation, consumers will soon see even more powerful AI capabilities coming to their devices.

Tune into the next "Future of Technology" webinar: Trends in Automotive by Rob Fisher, Senior Director of Product Management at Imagination, live on Thursday 6 November and available on-demand thereafter.

Visit our AI pages to find out more about Imagination's solutions for Generative AI.