Oracle Corporation

09/26/2024 | Press release

Announcing General Availability of OCI Compute with AMD MI300X GPUs

We're excited to announce the general availability of Oracle Cloud Infrastructure (OCI) Compute bare metal instances with AMD Instinct MI300X GPUs, BM.GPU.MI300X.8.
As AI adoption expands into new inference, fine-tuning, and training use cases, we want to give customers more choice with our first Compute instance powered by AMD Instinct accelerators. Today's applications require larger and more complex datasets, especially for generative AI and large language models (LLMs). AI infrastructure needs three critical elements to accelerate these workloads: compute performance, cluster network bandwidth, and high GPU memory capacity and bandwidth.

OCI's bare metal instances deliver performance without hypervisor overhead. OCI Supercluster with AMD Instinct MI300X accelerators provides a high-throughput, ultra-low-latency RDMA cluster network architecture that scales to 16,384 MI300X GPUs. With 192 GB of memory per accelerator, the AMD Instinct MI300X can run a 66-billion-parameter Hugging Face OPT transformer LLM on a single GPU.
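The single-GPU claim above follows from simple memory arithmetic. A minimal sketch (an illustration, not an official sizing tool; it counts only model weights in 16-bit precision and ignores KV cache and activations):

```python
def weights_gb(num_params: float, bytes_per_param: int = 2) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes), assuming
    16-bit (fp16/bf16) parameters by default."""
    return num_params * bytes_per_param / 1e9

# A 66-billion-parameter model in fp16 needs ~132 GB of weights,
# which fits within the MI300X's 192 GB of HBM3 with headroom left
# for the KV cache and activations.
opt_66b = weights_gb(66e9)
print(f"OPT-66B weights: ~{opt_66b:.0f} GB vs. 192 GB HBM3 per MI300X")
```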
OCI Compute with AMD Instinct MI300X
This instance type provides competitive economics. It is offered at $6 per GPU/hour with the following specifications:

Instance name: BM.GPU.MI300X.8
Instance type: Bare metal
Price (per GPU/hour): $6.00
Number of GPUs: 8 x AMD Instinct MI300X accelerators
GPU memory: 8 x 192 GB = 1.5 TB HBM3
GPU memory bandwidth: 5.3 TB/s
CPU: 2 x 56-core Intel Sapphire Rapids
System memory: 2 TB DDR5
Storage: 8 x 3.84 TB NVMe
Front-end network: 1 x 100G
Cluster network: 8 x (1 x 400G)
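Because pricing is quoted per GPU/hour and the shape is sold as a whole bare metal instance, the effective instance rate is the per-GPU rate times eight. A quick cost sketch based on the published rate (the 730-hour month is our own illustrative assumption, not an Oracle billing term):

```python
# Published rate from the specification table above.
GPU_HOURLY_RATE = 6.00    # USD per GPU/hour
GPUS_PER_INSTANCE = 8     # BM.GPU.MI300X.8 is a full bare metal shape

instance_hourly = GPU_HOURLY_RATE * GPUS_PER_INSTANCE   # $48.00/hour
monthly_estimate = instance_hourly * 730                # ~1 month of on-demand use
print(f"Per instance: ${instance_hourly:.2f}/hour, "
      f"~${monthly_estimate:,.2f} for a 730-hour month")
```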
As we shared in June, we partnered with AMD to validate their Instinct MI300X GPUs for serving LLMs. In our validation, time to first token was within 65 milliseconds, with an average latency of 1.5 seconds at a batch size of one. As the batch size increased, the hardware scaled linearly, generating a maximum of 3,643 tokens across 256 concurrent user requests (batches). For more details, read the blog post, Early LLM serving experience and performance results with AMD Instinct MI300X GPUs.
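Time to first token and throughput are straightforward to measure against any streaming endpoint. A minimal, self-contained sketch of the measurement logic (the simulated token stream below is a hypothetical stand-in for a real LLM serving endpoint, not OCI's or AMD's benchmark harness):

```python
import time
from typing import Iterable, Optional, Tuple

def measure_stream(tokens: Iterable[str]) -> Tuple[Optional[float], float, int]:
    """Return (time_to_first_token_s, tokens_per_sec, token_count)
    for a stream of generated tokens."""
    start = time.perf_counter()
    ttft = None
    count = 0
    for _ in tokens:
        if ttft is None:
            ttft = time.perf_counter() - start  # latency to first token
        count += 1
    elapsed = time.perf_counter() - start
    return ttft, (count / elapsed if elapsed > 0 else 0.0), count

# Hypothetical stand-in for a streaming LLM response.
def fake_stream(n: int = 50, delay: float = 0.001):
    for i in range(n):
        time.sleep(delay)
        yield f"tok{i}"

ttft, tps, n = measure_stream(fake_stream())
print(f"TTFT: {ttft * 1000:.1f} ms, throughput: {tps:.0f} tok/s over {n} tokens")
```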
Get started with BM.GPU.MI300X.8
BM.GPU.MI300X.8 is now generally available in the Oracle Cloud Console. Contact your Oracle sales representative or Kyle White, VP of AI infrastructure sales. Learn more about this bare metal instance in our documentation.