Huawei Technologies Co. Ltd.

16/08/2024 | Press release

Harnessing AI-Ready Data Infrastructure for Enterprise Applications

In addition to suggesting an AI-ready data infrastructure architecture and devising a solution for intelligent computing centers and cloud/Internet platforms, Huawei has also made inroads in edge training and inference. We hope to help enterprises build resilient, reliable, and open AI data infrastructure that turbocharges their mission-critical applications and propels their businesses to the next level.

Industries like banking, manufacturing, and healthcare are ideal for in-depth application of large AI models because they possess massive volumes of high-value data that can be utilized across various services. Enterprises in these industries, in particular, expect to integrate AI into existing applications such as production, R&D, and financial systems to enhance efficiency and innovation.


Retrieval-Augmented Generation (RAG) has proved to be the most cost-effective approach to AI reconstruction of enterprise applications. When used in conjunction with databases, this technique reduces hallucinations in large AI models and improves generation speed, inference precision, and interactivity.
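
To make the pattern concrete, here is a minimal, self-contained sketch of the general RAG loop described above: retrieve the most relevant entries from a knowledge repository and ground the model prompt in them. The toy corpus, scoring method, and function names are illustrative assumptions, not Huawei's implementation.

```python
# Minimal RAG loop: retrieve the most relevant knowledge-repository entries
# for a query, then ground the model prompt in them. The corpus, scoring,
# and function names are illustrative only.
from collections import Counter
import math

KNOWLEDGE_REPOSITORY = [
    "Loan applications above 500,000 CNY require two-level manual review.",
    "Production line A is scheduled for preventive maintenance every 30 days.",
    "Patient discharge summaries must be archived for at least 15 years.",
]

def _vectorize(text: str) -> Counter:
    return Counter(text.lower().split())

def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, top_k: int = 2) -> list[str]:
    q = _vectorize(query)
    ranked = sorted(KNOWLEDGE_REPOSITORY, key=lambda d: _cosine(q, _vectorize(d)), reverse=True)
    return ranked[:top_k]

def build_prompt(query: str) -> str:
    context = "\n".join(f"- {doc}" for doc in retrieve(query))
    # The assembled prompt is what would be sent to the large model; grounding
    # the answer in retrieved context is what reduces hallucinations.
    return f"Answer using only the context below.\nContext:\n{context}\nQuestion: {query}"

print(build_prompt("How often is production line A maintained?"))
```

In a production system, the bag-of-words scoring would be replaced by a vector database lookup and the assembled prompt passed to the large model, but the retrieve-then-generate structure stays the same.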

This article outlines the demands that arise from the AI reconstruction process and recommends a reference architecture for effective AI-ready data infrastructure, covering its key features and technology highlights.


What do the key roles require?

AI reconstruction of enterprise applications involves various roles. Some of these roles have unique demands, as listed below:

  • IT O&M personnel require compute and storage resources that can be used for multiple processes to simplify O&M management.
  • Data engineers require orchestratable data processing tool chains and open operators to improve data processing efficiency.
  • Application development engineers require a flexible and easy-to-use application development platform to quickly develop and debug high-accuracy RAG applications.


What kind of architecture are we looking for?

To address the aforementioned demands of key roles and streamline the implementation of large AI models in enterprises, the solution needs to provide installation and deployment, data processing, model fine-tuning, AI application development, and O&M optimization functions.

Figure: Reference architecture of the AI solution in the edge training and inference scenario

The reference architecture of this solution has the following features:


Model convergence

Inference-only use of the model is replaced by RAG, which, when used in conjunction with the knowledge repository, improves the accuracy of model outputs and reduces hallucinations.

Development convergence

A one-stop development framework and open operators are used for data processing and application development, thus eliminating the need for multiple tools and decentralized management.

Resource convergence

Resource scheduling, computing power pooling, and shared storage provide a unified resource pool of computing power and storage for various workloads, thus improving utilization and reliability.

Converged management

Hardware and software are centrally managed and maintained, and full-stack optimization is supported.

Storage-compute-network collaboration

Storage, compute, and network resources are collaboratively pre-optimized for training and inference, and hardware and software are pre-installed, delivering out-of-the-box usability.

What makes the reference data infrastructure powerful?

Efficient pooling of computing power

GPU resources can be allocated at a granularity of 1% of computing power or 1 MB of GPU memory. Computing power can be accessed remotely across nodes, and a shared memory mechanism helps eliminate data transfer overheads between CPUs and GPUs.
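
As a conceptual illustration of such fine-grained pooling, the sketch below tracks compute in 1% slices and memory in 1 MB units so that several workloads can share one card. The GPUPool class, its allocation policy, and the workload names are hypothetical and only illustrate the bookkeeping idea, not Huawei's scheduler.

```python
# Conceptual sketch of fine-grained GPU pooling: compute is tracked in 1%
# slices and memory in 1 MB units, so several workloads can share one card.
# The GPUSlice/GPUPool names and the allocation policy are illustrative only.
from dataclasses import dataclass

@dataclass
class GPUSlice:
    workload: str
    compute_pct: int   # whole percentage points of the card's compute
    memory_mb: int     # whole megabytes of GPU memory

class GPUPool:
    def __init__(self, total_memory_mb: int):
        self.free_compute_pct = 100
        self.free_memory_mb = total_memory_mb
        self.slices: list[GPUSlice] = []

    def allocate(self, workload: str, compute_pct: int, memory_mb: int) -> GPUSlice:
        if compute_pct > self.free_compute_pct or memory_mb > self.free_memory_mb:
            raise RuntimeError(f"not enough free GPU resources for {workload}")
        self.free_compute_pct -= compute_pct
        self.free_memory_mb -= memory_mb
        gpu_slice = GPUSlice(workload, compute_pct, memory_mb)
        self.slices.append(gpu_slice)
        return gpu_slice

# One 32 GB card shared by an inference service and a fine-tuning job.
pool = GPUPool(total_memory_mb=32 * 1024)
pool.allocate("rag-inference", compute_pct=30, memory_mb=8 * 1024)
pool.allocate("model-finetune", compute_pct=65, memory_mb=22 * 1024)
print(pool.free_compute_pct, pool.free_memory_mb)  # remaining headroom: 5 and 2048
```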

Open orchestration framework

A variety of plugins are provided to facilitate development, and the development and release functions are decoupled to prevent architectural erosion and simplify expansion. Cross-language and cross-device service standards allow the same service to be implemented in multiple languages and delivered across multiple devices. Services are orchestrated and scheduled visually, and service flows are generated automatically, making common components and services easy to share and reuse.
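
The sketch below illustrates the plugin-and-orchestration idea in a few lines: operators register themselves in an open registry, and a service flow is simply an ordered composition of them. The registry, decorator, and operator names are hypothetical examples of the pattern, not the actual framework APIs.

```python
# Tiny sketch of an open orchestration framework: operators register as
# plugins, and a service flow is an ordered composition of them.
# The registry, operator names, and flow definition are illustrative.
from typing import Callable

OPERATOR_REGISTRY: dict[str, Callable[[dict], dict]] = {}

def operator(name: str):
    """Register a data-processing or application operator as a plugin."""
    def register(fn: Callable[[dict], dict]) -> Callable[[dict], dict]:
        OPERATOR_REGISTRY[name] = fn
        return fn
    return register

@operator("clean_text")
def clean_text(ctx: dict) -> dict:
    ctx["text"] = " ".join(ctx["text"].split())
    return ctx

@operator("extract_keywords")
def extract_keywords(ctx: dict) -> dict:
    ctx["keywords"] = [w for w in ctx["text"].lower().split() if len(w) > 4]
    return ctx

def run_flow(flow: list[str], ctx: dict) -> dict:
    """Execute a declared service flow by chaining registered operators."""
    for step in flow:
        ctx = OPERATOR_REGISTRY[step](ctx)
    return ctx

print(run_flow(["clean_text", "extract_keywords"], {"text": "  Quarterly   maintenance report  "}))
```

Keeping the flow definition separate from the operator implementations is what allows flows to be shared, reused, and regenerated without touching the underlying components.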

High-precision modular RAG

Query requests are aligned with the semantics of the knowledge repository through query rewriting/expansion and keyword extraction. Hybrid search, recursive search, and LLM-enhanced representation and ranking are used to improve retrieval precision. Search results are re-ranked and contexts are compressed to highlight key semantics, shorten the context, and improve RAG performance.
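
The following sketch strings together the stages named above: query rewriting, keyword extraction, hybrid search, re-ranking, and context compression. Every stage is a simplified placeholder meant only to show how a modular RAG query path composes; real modules would use vector indexes and LLM-based re-rankers.

```python
# Sketch of a modular RAG query path: query rewriting, keyword extraction,
# hybrid (lexical + vector) search, re-ranking, and context compression.
# Each stage here is a deliberately simplified placeholder.

def rewrite_query(query: str) -> str:
    # Real systems expand abbreviations and align wording with the repository.
    return query.replace("maint.", "maintenance")

def extract_keywords(query: str) -> set[str]:
    return {w for w in query.lower().split() if len(w) > 3}

def hybrid_search(query: str, keywords: set[str], corpus: list[str], top_k: int = 3) -> list[str]:
    # Lexical overlap only; a vector-similarity score would be blended in here.
    ranked = sorted(corpus, key=lambda d: len(keywords & set(d.lower().split())), reverse=True)
    return ranked[:top_k]

def rerank(query: str, docs: list[str]) -> list[str]:
    # Placeholder for an LLM- or cross-encoder-based re-ranker.
    return sorted(docs, key=len)

def compress_context(docs: list[str], max_chars: int = 200) -> str:
    # Keep only as much context as fits, preserving the highest-ranked passages.
    kept, used = [], 0
    for d in docs:
        if used + len(d) > max_chars:
            break
        kept.append(d)
        used += len(d)
    return "\n".join(kept)

corpus = [
    "Production line A undergoes preventive maintenance every 30 days.",
    "Maintenance records are stored in the plant knowledge repository.",
    "Line B was decommissioned in 2022.",
]
q = rewrite_query("How often is line A maint. done?")
context = compress_context(rerank(q, hybrid_search(q, extract_keywords(q), corpus)))
print(context)
```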

Conclusion

For enterprises looking to unleash the potential of their data and stay ahead of the curve in the AI era, an AI-ready data infrastructure is a must-have, and one with the above features and highlights is ideal.



Huawei is an industry leader with over 20 years of extensive investment in data infrastructure. It offers a broad range of products, solutions, and case studies to help you handle AI workloads with ease. Learn more about our award-winning OceanStor Data Storage and how to unleash the full potential of your data.

Disclaimer: Any views and/or opinions expressed in this post by individual authors or contributors are their personal views and/or opinions and do not necessarily reflect the views and/or opinions of Huawei Technologies.
