Huawei Technologies Co. Ltd.

26/07/2024 | Press release

AI-Ready Data Infrastructure: The Answer to Large Models

There is no AI without data. Today's data infrastructure is the fuel for large AI models and the engine behind end-to-end AI services, but models deliver on that promise only when they are trained and iterated on large volumes of high-quality data.

AI is set to disrupt conventional data infrastructure by demanding greater performance, reliability, and scalability in four major areas: data asset management, cluster utilization, data consistency, and data resilience. AI transformation strategies also draw on vast stores of unstructured data, so enterprises whose data is AI-ready are best positioned to grow with AI.

So the question is, how can we ensure AI-ready data infrastructure?


Designing data infrastructure for large models

AI is pushing the boundaries of data storage software and hardware. To serve AI applications and services, AI-ready infrastructure should provide the following:

1. Large-scale ingestion and preprocessing

In many enterprises, data is scattered across multiple data centers, or across storage devices within the same facility, which makes it hard to manage. The lack of a unified view can become a significant bottleneck in AI training. A unified namespace, by contrast, can efficiently manage, schedule, and share data across regions, making data visible, manageable, and available. This is the role of an AI data lake solution, which streamlines data preprocessing and supports multiple access protocols to provide high-quality datasets for AI training.
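To make the unified-namespace idea concrete, below is a minimal sketch in Python. The class and method names (UnifiedNamespace, Backend, mount) are illustrative assumptions for this article, not Huawei's actual API; a real AI data lake would add caching, tiering, and multi-protocol (NFS/S3/HDFS) access on top of this basic routing idea.

    from dataclasses import dataclass
    from typing import Dict, Iterator

    @dataclass
    class Backend:
        """One physical location, e.g. an NFS export or an S3 bucket."""
        name: str
        files: Dict[str, bytes]  # path -> content; stands in for a real client

    class UnifiedNamespace:
        """Presents files scattered across many backends as one logical tree."""

        def __init__(self) -> None:
            self._mounts: Dict[str, Backend] = {}

        def mount(self, prefix: str, backend: Backend) -> None:
            """Attach a backend under a logical path prefix."""
            self._mounts[prefix.rstrip("/")] = backend

        def read(self, path: str) -> bytes:
            """Route a logical path to whichever backend holds the data."""
            for prefix, backend in self._mounts.items():
                if path.startswith(prefix + "/"):
                    return backend.files[path[len(prefix) + 1:]]
            raise FileNotFoundError(path)

        def list_all(self) -> Iterator[str]:
            """One global view: every file, regardless of where it lives."""
            for prefix, backend in self._mounts.items():
                for name in backend.files:
                    yield f"{prefix}/{name}"

    # Usage: two data centers, one namespace for the training job.
    ns = UnifiedNamespace()
    ns.mount("/dc-east", Backend("east", {"corpus/a.txt": b"..."}))
    ns.mount("/dc-west", Backend("west", {"corpus/b.txt": b"..."}))
    print(sorted(ns.list_all()))  # ['/dc-east/corpus/a.txt', '/dc-west/corpus/b.txt']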

2. High performance and strong consistency

Loading training datasets and writing checkpoints are pivotal for keeping compute utilization high and training state durable, and high performance is a critical requirement for both. Moreover, maintaining strong consistency across the cluster significantly enhances availability and stability, which is essential for seamless operation at scale. Large-scale AI clusters should therefore be built on high-performance cluster file storage with strong consistency, deployed as a real-time data sharing platform that streamlines the entire AI pipeline.
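As an illustration of why atomicity and consistency matter for checkpointing, here is a minimal Python sketch (the function name and paths are hypothetical, not from the white paper): the checkpoint is written to a temporary file and then atomically renamed, so on a strongly consistent shared file system every node sees either the previous checkpoint or the complete new one, never a partial write.

    import os
    import pickle
    import tempfile

    def save_checkpoint(state: dict, directory: str, step: int) -> str:
        """Write a checkpoint so a crash never exposes a half-written file."""
        final_path = os.path.join(directory, f"ckpt-{step:08d}.pkl")
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        try:
            with os.fdopen(fd, "wb") as f:
                pickle.dump(state, f)
                f.flush()
                os.fsync(f.fileno())  # persist bytes before making them visible
            os.replace(tmp_path, final_path)  # atomic publish on POSIX systems
        except BaseException:
            os.unlink(tmp_path)
            raise
        return final_path

    # Usage: persist training state every N steps to the shared mount.
    demo_dir = tempfile.mkdtemp()  # stand-in for a shared cluster file system
    print(save_checkpoint({"step": 1000, "weights": [0.1, 0.2]}, demo_dir, 1000))

The same pattern underlies recovery: after a failure, training resumes from the newest complete checkpoint instead of restarting from scratch.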

3. Superb resilience

It is estimated that a single day of service suspension at an intelligent computing center could cause a direct loss of nearly 2 million Chinese yuan (around US$276,000). With valuable data assets such as high-cost training data and checkpoints at stake, device-level resilience becomes paramount for ensuring high availability and preventing data loss or damage caused by physical faults. Data infrastructure must support extreme cross-site uptime demands, targeting six nines (99.9999%) or even seven nines (99.99999%) availability.
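For a sense of scale, those availability targets translate directly into allowed downtime; the short calculation below (plain arithmetic, not vendor data) shows why each extra nine matters:

    SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000

    def max_downtime_seconds(nines: int) -> float:
        """Allowed downtime per year at an availability of N nines.

        Availability = 1 - 10**(-nines); e.g. 6 nines = 99.9999%.
        """
        return SECONDS_PER_YEAR * 10 ** (-nines)

    for n in (5, 6, 7):
        print(f"{n} nines -> {max_downtime_seconds(n):7.1f} s/year")
    # 5 nines ->   315.4 s/year (about 5.3 minutes)
    # 6 nines ->    31.5 s/year
    # 7 nines ->     3.2 s/year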

However, physical protection alone cannot address logical faults and emerging threats. AI has accelerated the iteration of ransomware variants and opened new resilience vulnerabilities, necessitating a multi-layered defense strategy. Intrinsic resilience forms the last line of defense for data by embedding protection throughout the enterprise's infrastructure: the storage system itself, data disaster recovery and backup, ransomware protection, and comprehensive data management.
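To illustrate one layer of such a defense, the sketch below uses a generic heuristic (an example for this article, not Huawei's detection logic): files rewritten by ransomware are encrypted, and encrypted data has near-maximal byte entropy, which a storage-side scanner can flag.

    import math
    from collections import Counter

    def shannon_entropy(data: bytes) -> float:
        """Bits per byte: ~8.0 for encrypted or compressed data, lower for text."""
        if not data:
            return 0.0
        n = len(data)
        return -sum(c / n * math.log2(c / n) for c in Counter(data).values())

    def looks_encrypted(data: bytes, threshold: float = 7.5) -> bool:
        """Crude ransomware signal: a rewrite with near-random bytes is
        suspicious; real products combine many such signals with
        immutable snapshots and air-gapped backups."""
        return shannon_entropy(data) > threshold

    print(looks_encrypted(b"quarterly report, plain text " * 50))  # False
    print(looks_encrypted(bytes(range(256)) * 50))                 # True (entropy = 8.0)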


Embracing the AI transformation in enterprise services

Generative AI has sparked a watershed moment for enterprise-grade services. Demonstrating immense potential across diverse fields, AI solutions are turning risks and challenges into opportunities, with full-stack AI platforms supporting every product, framework, and pipeline where generative AI is deployed. This is the future, and it is exactly where Huawei is headed.

Download Huawei's AI-Ready Data Infrastructure Reference Architecture White Paper.

Disclaimer: Any views and/or opinions expressed in this post by individual authors or contributors are their personal views and/or opinions and do not necessarily reflect the views and/or opinions of Huawei Technologies.
