IBM - International Business Machines Corporation

11/18/2024 | News release | Distributed by Public on 11/18/2024 08:49



Rapidly unlocking geospatial insights with IBM AI chips

The University of Alabama in Huntsville is installing a new system containing IBM AIU chips for running AI models developed by IBM and NASA.

An example AIU cluster installation.


As AI advances at a breakneck pace, it's essential to design new hardware in tandem with the models developers are creating, so that the hardware gets the best out of those models. Today, most AI models are trained and run on GPUs, which handle the performance needs of these models well but require substantial amounts of energy to run. New hardware specifically designed for AI workloads could lead to improvements in both energy efficiency and performance.

IBM Research took on this challenge five years ago with the opening of the IBM Research AI Hardware Center, and in 2022, it debuted the artificial intelligence unit (AIU) chip, the first complete system-on-a-chip built specifically to tackle AI models' power-hungry needs. Now, researchers studying our planet's changing climate at The University of Alabama in Huntsville are putting IBM chips born from the AIU to the test with a new computing cluster installed on campus that has the potential to make running AI models much more energy efficient.

New chips on the block

IBM and The University of Alabama in Huntsville (UAH) are collaborating to install a cluster at UAH containing IBM Spyre chips to run advanced AI models. IBM Spyre is the first AIU production accelerator born out of IBM Research, and is part of a long-term strategy of developing novel architectures and full-stack technology solutions for the emerging space of generative AI. It belongs to the IBM Research AIU family, which also includes IBM AIU NorthPole and work on analog chips. A commercial version of the IBM Spyre AIU chip, the IBM Spyre Accelerator, was announced in August at the Hot Chips conference.

"The new cluster at UAH will enable researchers to tune and test not only new applications for these AI foundation models, but also the performance of the IBM Spyre accelerators," said Jeff Burns, director of the IBM Research AI Hardware Center. "We're looking forward to seeing how the chips can speed up AI workflows in this exciting field and ultimately lead to more efficient model deployment."

At UAH, IBM will install an integrated cluster of IBM AIUs and GPUs. Researchers will primarily use the cluster for deploying the Prithvi geospatial and weather and climate models from IBM and NASA, testing the operational workloads and running throughput-focused workloads. The cluster will run on Red Hat OpenShift AI, demonstrating the value of a full-stack - hardware and software - solution that leverages heterogeneous accelerators.

"Red Hat OpenShift AI is designed to deliver a consistent and scalable platform that accelerates AI model deployment and testing across diverse hardware environments," said Steven Huels, vice president, AI engineering, Red Hat. "In this new cluster, Red Hat OpenShift AI provides researchers with the flexibility to manage complex AI workloads and allows them to unite and optimize diverse computing resources for top performance. Together, Red Hat, IBM and UAH are driving innovation in fields that depend on cutting-edge AI capabilities, from climate modeling to geospatial analysis, and showcasing the potential for integrated AI hardware and software platforms."

Little chips, big impact

Researchers anticipate that IBM Spyre's unique architecture will help reduce energy consumption for AI fine-tuning and inferencing. In preliminary tests led by IBM Research, the IBM Spyre AIU cluster running inference on the IBM-NASA geospatial foundation model could process 2.1 images per second per watt, versus 0.6 images per second per watt on standard GPUs, more than three times as many.
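To make the comparison concrete, the two throughput-per-watt figures can be turned into a ratio and an energy cost per image. The short Python sketch below uses only the 2.1 and 0.6 figures quoted above; the function name and the derived quantities are illustrative, not part of IBM's published results. It relies on the fact that one image per second per watt is the same as one image per joule, and it assumes both systems process the same workload.

```python
# Figures quoted in the text (preliminary tests led by IBM Research).
AIU_IMG_PER_S_PER_W = 2.1
GPU_IMG_PER_S_PER_W = 0.6


def compare_efficiency(aiu: float, gpu: float) -> dict:
    """Compare two throughput-per-watt figures given in images/second/watt.

    Images per second per watt equals images per joule, so the reciprocal
    is the energy spent on each image.
    """
    if aiu <= 0 or gpu <= 0:
        raise ValueError("efficiency figures must be positive")
    return {
        "images_per_watt_ratio": aiu / gpu,         # AIU advantage over GPU
        "joules_per_image_aiu": 1.0 / aiu,
        "joules_per_image_gpu": 1.0 / gpu,
        "energy_saved_fraction": 1.0 - gpu / aiu,   # per image, same workload
    }


if __name__ == "__main__":
    summary = compare_efficiency(AIU_IMG_PER_S_PER_W, GPU_IMG_PER_S_PER_W)
    for name, value in summary.items():
        print(f"{name}: {value:.3f}")
```

On these numbers the ratio is about 3.5, which corresponds to roughly 71 percent less accelerator energy per image. These are derived values from preliminary figures, so they should be read as indicative only.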

When the system is analyzing the 70 terabytes of incoming satellite data streamed each day, this sort of efficiency gain could make a huge difference. At the current rate, running the geospatial model on UAH's new AIU cluster could cut power draw by 23 kW, which over a year is equivalent to the energy used by 20 U.S. homes and to 85 tons of carbon emissions. This estimate includes the cards, node overheads and reduced data center cooling costs.
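The annual figures can be sanity-checked with a little arithmetic. The sketch below assumes the 23 kW is a sustained reduction in power draw held for a full year of 8,760 hours; that reading is an assumption, not something the release states. From it, the code derives the annual energy saved and the per-home consumption and carbon intensity that the quoted 20 homes and 85 tons would imply. The function and key names are illustrative.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def annual_savings(power_kw: float, homes: int, carbon_tons: float) -> dict:
    """Derive annual energy and implied per-unit figures.

    Assumes `power_kw` is a constant power reduction sustained all year
    (an assumption made for this sketch, not stated in the source).
    """
    if power_kw <= 0 or homes <= 0:
        raise ValueError("power_kw and homes must be positive")
    annual_kwh = power_kw * HOURS_PER_YEAR
    annual_mwh = annual_kwh / 1000
    return {
        "annual_mwh": annual_mwh,
        "implied_kwh_per_home": annual_kwh / homes,
        "implied_tons_co2_per_mwh": carbon_tons / annual_mwh,
    }


if __name__ == "__main__":
    # 23 kW, 20 homes and 85 tons are the figures quoted in the text.
    for name, value in annual_savings(23, 20, 85).items():
        print(f"{name}: {value:.2f}")
```

Under that assumption the saving is about 201 MWh a year, or roughly 10,000 kWh per home and about 0.42 tons of carbon per MWh, so the three quoted figures are mutually consistent with a continuous 23 kW reduction.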

Once the IBM AIU cluster is up and running on campus, researchers will start by deploying workloads that will push the performance boundaries of the system, which will be used primarily for AI fine-tuning and inferencing. These experiments will help researchers validate IBM Spyre's capabilities, as well as identify ways to make the chips even more energy-efficient.

The cluster will be housed in the National Space Science Technology Center on UAH's campus.

Advancing geospatial science and the AI workforce

UAH's cluster will reside at the National Space Science Technology Center (NSSTC), where UAH faculty, NASA researchers, and atmospheric scientists work alongside one another. "The goal is to have tight collaborations - research and education all in the same building," explained Sujit Roy, a UAH computer scientist who leads the foundation model development team within NASA's Interagency Implementation and Advanced Concepts Team (IMPACT).

This installation is part of a larger effort between IBM, UAH, and NASA to apply foundation models to help us learn more about our planet. UAH researchers worked alongside IBM and NASA to develop the open-source geospatial model and the weather and climate models. Together, the team conceptualized the models, trained them on NASA satellite data, scaled them, and prepared them for release on public platforms. For the geospatial work, NASA awarded IBM and UAH a NASA Marshall Space Flight Center Group Achievement Award this past August.

The IBM Spyre cluster at UAH can now be used to test and deploy a variety of downstream applications of the foundation models, from flood detection to assessing tree canopy height or measuring gravity waves. Both the geospatial model and the weather and climate foundation models are available on the open-source AI platform Hugging Face.

The UAH research team doesn't just include AI developers. Udaysankar Nair, a professor of Atmospheric and Earth Science who focuses on atmospheric numerical modeling, evaluates each model from a science perspective to ensure the "AI model works in such a manner that is consistent with the physics." And, Nair shared, there's the potential for the Spyre system and the models running on it to be integrated into UAH curricula in the future as well. "That is something I'm hoping would eventually come out of here too," he added.

Currently, IBM researchers are constructing and pre-populating the system for UAH. Once it's shipped, it can be up and running on campus within days, ready to start exploring some of the mysteries of our planet hiding in satellite data.