Dell Technologies Inc.

09/06/2024 | Press release

Making Sense of AI PCs

In the last few months, the world has been inundated with news and marketing surrounding the introduction of "AI PCs." There's no denying the buzz and excitement around these new AI PCs. Yet, finding clear-cut, actionable guidance on how to truly harness their benefits as a customer can feel like looking for a needle in a haystack. It's time we address this information gap and empower users to make the most of this groundbreaking technology.

Comprehensive Guide

At Dell Technologies, we want to provide a comprehensive guide designed to bridge the understanding gap around AI PCs, the performance of AI acceleration hardware like neural processing units (NPUs) and graphics processing units (GPUs), and the emerging software ecosystem that's taking advantage of them.

The simple truth is that all PCs can process AI features, but the introduction of specialized AI processing circuits delivers incremental performance and efficiency beyond what older CPUs could deliver. This means they can handle complex AI tasks more quickly while using less energy. This is a big step forward in PC technology, pushing the limits of what PCs can do and setting the stage for even better AI applications in the future.

Plus, independent software vendors (ISVs) are rapidly introducing AI-based features and functionality to existing software, while also creating innovative software that leverages the unique capabilities of generative AI (GenAI). For customers to get the maximum benefit from this new software and hardware, it's important to understand whether these new software features are processed locally on the PC or in the cloud. This understanding will ensure they're harnessing the full potential of their technology investments.

Accelerated AI Features

An illustrative example of this is Microsoft Copilot. Microsoft Copilot and its AI capabilities are currently processed in Microsoft's cloud, so any PC can take advantage of its productivity and time-saving features. Contrast this with Copilot+, where Microsoft is delivering unique, incremental AI features that will be processed exclusively on a Copilot+ AI PC, which, among other things, is defined by a more powerful NPU. More on that later.

Keep in mind that ISVs have been pursuing locally accelerated AI features for years prior to the introduction of AI PCs that feature NPUs. When NVIDIA introduced RTX GPUs in 2018, the company included dedicated AI acceleration circuitry called Tensor Cores. Graphics-specific ISV applications from games to professional video, 3D animation, CAD and design software all began experimenting with GPU-processed AI features as NVIDIA RTX GPUs became popular across these markets.

For data scientists who wanted to get started with machine learning and GenAI applications, AI workstations with RTX GPUs quickly became the ideal sandbox environment, allowing experimentation with private data behind the corporate firewall and offering better cost predictability than virtual compute environments in the cloud, where the meter is always running.

All these GPU-driven AI use cases, mostly involving workstation users leveraging professional NVIDIA RTX graphics, tend to favor performance first with less regard for energy efficiency. With their energy-efficient AI processing, NPUs bring a new attribute to the market for working with AI features.

Regardless of the processing domain (NPU, GPU or cloud), ISVs must do the hard work of coding to support any or all of them for customers to realize a benefit. Some features may only support the NPU, some may only support the GPU and some are only available in the cloud. Understanding the ISV applications you use every day and how they will leverage your AI processing hardware is important to getting the best experience.
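To make this concrete, here is a minimal sketch of how an application might pick among the processing domains on a given machine, using ONNX Runtime as one illustrative framework. The model file name and the preference order are assumptions for the example, not a description of how any particular ISV application behaves.

```python
# Minimal sketch: choosing an AI processing backend with ONNX Runtime.
# The model path and preference order are illustrative assumptions only.
import onnxruntime as ort

PREFERRED_PROVIDERS = [
    "QNNExecutionProvider",   # Qualcomm NPU
    "DmlExecutionProvider",   # DirectML, runs on most Windows GPUs
    "CUDAExecutionProvider",  # NVIDIA RTX GPUs
    "CPUExecutionProvider",   # always-available fallback
]

def create_session(model_path: str) -> ort.InferenceSession:
    """Create an inference session on the best backend this install supports."""
    available = set(ort.get_available_providers())
    providers = [p for p in PREFERRED_PROVIDERS if p in available]
    session = ort.InferenceSession(model_path, providers=providers)
    print("Running on:", session.get_providers()[0])
    return session

session = create_session("feature_model.onnx")  # hypothetical model file
```

Which backends actually show up depends on the runtime packages and drivers installed on the system, which is exactly the enablement work ISVs have to do before a feature can light up on your NPU or GPU.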

AI acceleration hardware is defined by a few important attributes that determine processing performance, suitability for certain workflows and energy efficiency.

Neural Processing Units

Let's start with NPUs. NPUs are a relatively recent introduction to the AI processing market and commonly take the form of a portion of a PC CPU's circuitry. The latest CPUs from Intel and Qualcomm feature integrated NPUs, so they are part of the processor. This circuitry is optimized for running AI features, a task commonly called AI inferencing, which is primarily based on integer math. NPUs are excellent at the integer math necessary for inferencing, and they have the added benefit of performing it with very low energy consumption, which makes them ideal for AI on laptops where battery life is important for mobility. While NPUs are commonly found as circuitry within latest-generation CPUs, discrete NPUs are also coming into the marketplace in the form of M.2 or PCIe add-in cards, and they generally serve the same function of accelerating AI inferencing.
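To illustrate what that integer math looks like in practice, here is a small, self-contained sketch of 8-bit quantization, the common technique that turns a model's floating-point weights into integers an NPU can multiply efficiently. The weight values are invented for the example.

```python
# Sketch of 8-bit quantization: floating-point weights become integers
# suited to NPU-style math. Values are illustrative only.
import numpy as np

weights = np.array([0.42, -1.30, 0.07, 2.15, -0.88], dtype=np.float32)

# Map the floating-point range onto signed 8-bit integers (-127..127).
scale = np.abs(weights).max() / 127.0
q_weights = np.round(weights / scale).astype(np.int8)

# Inference-time math runs on the int8 values; outputs are rescaled afterward.
reconstructed = q_weights.astype(np.float32) * scale

print("int8 weights:       ", q_weights)
print("reconstruction error:", np.abs(weights - reconstructed).max())
```

The small reconstruction error is the trade-off for doing the heavy multiply-accumulate work in low-precision integers, which is exactly where NPUs are fast and frugal with power.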

With NPUs' recent introduction to the market, ISVs are just beginning to release software updates or versions with AI features supporting them. There are already exciting new capabilities NPUs enable, and the number of ISV applications and features is expected to grow rapidly.

NVIDIA Discrete and Integrated Graphics Cards

NVIDIA RTX GPUs are available both as discrete chips in laptops and as PCIe add-in cards for PCs and workstations. They provide a greater range of AI performance and additional use case functionality, though they don't deliver the energy efficiency of NPUs. Metrics on the AI performance of NPUs and GPUs will be provided later in this post, but with a range of offerings and the ability to add multiple cards to desktop, tower and rack workstations, GPUs provide higher levels of scalable AI processing performance for advanced workflows than NPUs.

NVIDIA RTX GPUs have an added benefit in that they can not only be used for inferencing (with exceptional integer math performance) but are also suitable for training and development of GenAI large language models (LLMs). This results from their acceleration of floating-point calculations and extensive support in the toolchains and libraries commonly used by data scientists and AI software developers.
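As a concrete illustration of that development-and-training side, here is a minimal PyTorch sketch of a single floating-point training step that runs on an RTX GPU when one is present and falls back to the CPU otherwise. The tiny model and random data are placeholders rather than a real LLM workload.

```python
# Minimal sketch: one floating-point training step, on an RTX GPU if available.
# The model and data are toy placeholders standing in for a real workload.
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10)).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

inputs = torch.randn(32, 128, device=device)          # a batch of fake features
targets = torch.randint(0, 10, (32,), device=device)  # fake labels

optimizer.zero_grad()
loss = loss_fn(model(inputs), targets)  # forward pass in floating point
loss.backward()                         # gradients, also floating point
optimizer.step()

print(f"device={device}, loss={loss.item():.4f}")
```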

Making it Real for Your Business

AI performance is commonly measured in TOPS, or trillions of operations per second. TOPS is a measurement of the potential peak AI inferencing performance based on the architecture and frequency of the processor. This measure shouldn't be confused with TFLOPS, which represents the ability of a computer system to perform trillions of floating-point calculations per second.
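For a rough sense of where a peak TOPS figure comes from, here is a back-of-the-envelope calculation. The multiply-accumulate (MAC) unit count and clock frequency are invented for illustration and don't describe any particular product.

```python
# Back-of-the-envelope peak TOPS estimate. A multiply-accumulate (MAC)
# counts as two operations. The figures below are purely illustrative.
mac_units = 4096    # parallel INT8 MAC units (hypothetical)
clock_hz = 1.4e9    # 1.4 GHz clock (hypothetical)

ops_per_second = mac_units * 2 * clock_hz
print(f"Peak: {ops_per_second / 1e12:.1f} TOPS")  # about 11.5 TOPS

# The same formula applied to floating-point units yields a TFLOPS figure.
# Real applications achieve only a fraction of either peak number.
```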

Here's a chart that illustrates the relative TOPS across a range of AI-focused computing devices:

This chart shows the wide range of AI inferencing scalability across Dell's AI PCs and AI workstations. It also illustrates that desktop and tower AI workstations can scale inferencing power even further by adding multiple RTX GPUs. A light blue overlay has been added to indicate which AI workstation models are ideally configured for AI development and training workflows. Keep in mind that while TOPS serves as a relative measure of performance, actual performance will be defined by the specific application operating in that environment. Likewise, the specific application or AI feature must support the specific processing domain to take advantage of the hardware capability. As ISVs continue to evolve their applications, it may be possible for a single application to route AI processing across all available AI hardware in systems that have a CPU, NPU and RTX GPU for maximum performance.

TOPS is not the only important attribute in handling AI. Memory is also important, especially for GenAI LLMs. Depending on how the LLMs are handled, they can demand large amounts of available memory. Integrated NPUs, like those in the Intel Core Ultra and Qualcomm Snapdragon processors, utilize a portion of system RAM. With this understanding, it's a good idea to purchase the largest RAM configuration you can afford in an AI PC, as it will serve not just the AI processing discussed here but also general computing, graphics tasks and multitasking across applications.
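To see why memory matters so much for GenAI LLMs, here is a quick sizing sketch. It counts only the model weights (real workloads also need memory for activations, context and the rest of the system), and the parameter counts are generic examples rather than specific supported models.

```python
# Rough memory needed just to hold LLM weights, by parameter count and precision.
# Ignores activations, context caches and framework overhead; figures are illustrative.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

for params_billion in (3, 7, 13):
    line = f"{params_billion:>2}B params:"
    for precision, bytes_per in BYTES_PER_PARAM.items():
        gib = params_billion * 1e9 * bytes_per / 2**30
        line += f"  {precision}={gib:5.1f} GiB"
    print(line)
```

Even a mid-sized model held at fp16 can consume a large share of a modestly configured laptop's RAM before the operating system and other applications are counted, which is why the largest affordable memory configuration is the safer choice.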

Discrete NVIDIA RTX GPUs contain dedicated memory, with each specific model offering some variation in both TOPS performance and memory amount across GPUs for mobile and fixed AI workstations. With VRAM capacities up to 48GB, such as on the RTX 6000 Ada, and the ability to support up to four GPUs in the Precision 7960 Tower for a total of 192GB of VRAM, AI workstations can scale for the most advanced inferencing workflows and provide a high-performance AI model development and training sandbox for customers who may not be ready for the even greater scalability of the Dell PowerEdge GPU AI server range. RTX GPU VRAM works in a similar way to system RAM with the NPU: it is shared across GPU-accelerated compute, graphics and AI processing, and application multitasking will put even more demands upon it. If you're a regular application multitasker using apps with a lot of GPU acceleration, you should aim to buy an AI workstation with the largest GPU (and VRAM) within your budget.

A little information can go a long way toward helping you understand and unlock the capabilities of AI PCs and AI workstations. In this era, where AI features are rapidly proliferating across every software application, whether commercial packaged software or in-house custom-developed tools, you can do more than participate in the time-saving efficiencies and the ability to generate all types of creative content. You can maximize those benefits by optimizing the configuration of your AI PCs and AI workstations.