Oracle Corporation

09/24/2024 | Press release | Distributed by Public on 09/24/2024 11:56

Behind the Scenes: Optimizing data centers with CFD

The heat is on! Imagine your data center as a lively neighborhood with a busy city street. Like how the hustle of city life generates heat from the movement of vehicles, machinery and people, the dense packing of servers and IT equipment generates significant heat. And data centers aren't only feeling the heat from their own operations. They're also facing the heat of climate change.
So, how do we help ensure our data centers stay cool and operate optimally amidst these challenges? At Oracle Cloud Infrastructure (OCI), with our trusty cooling strategies, the integration of computational fluid dynamics (CFD) analysis with data center engineering design principles plays a pivotal role in designing and implementing efficient and reliable cooling systems for data centers.
This blog post aims to delve into the role of CFD analysis in data center mechanical system design, exploring how it assists in identifying potential flaws, informing design modifications to prevent equipment failure and ensuring system integrity.
What is CFD analysis?
Returning to the data center as a neighborhood analogy, imagine CFD analysis as the city planner. CFD analysis is a fundamental tool to support the mechanical system design in data centers, just like a city planner maps out streets and buildings. It enables a detailed simulation of air flow, temperature distribution, and pressure variations within the data center environment, much like how a city planner predicts traffic patterns. By creating a virtual model of the data center and running this simulation, designers can visualize how air moves through the space and pinpoint potential hot spots and bottlenecks, just like a city planner identifies areas prone to traffic jams. These hot spots and bottlenecks pose a risk to equipment reliability and longevity. By analyzing them, designers can help ensure your cooling systems are up to the task, much like a city planner helps ensure that roads and transportation systems can handle the flow of people. CFD provides a powerful means to visualize and quantify the effects of design choices on the thermal environment.
As data center infrastructure designers, we must figure out how to use CFD modeling to help align IT racks in a data hall while trying to deliver sufficient air flow into the aisles in front of the racks. During the days of raised access floors and perforated floor tiles, no one was doing any type of containment. Understanding how much of the cold supply air would "bypass" the racks and how much hot return air would "recirculate" into the fronts of the racks were both important concerns.
We attempted some what-if scenarios using baffles or diverters to help direct air flow to where it was needed. Over time, these baffles and diverters grew in scope to what we now call full containment: complete separation between the cold and hot sides of IT racks. Eventually, the forward-thinking designers and end users started to include containment in all their deployments, and this shift led to newer best practices that specifically address keeping cold air and hot air separated. All supply air enters the racks and provides productive cooling because the only possible pathway for the air flow is from the computer room air handling (CRAH) unit to the cold aisle, through the IT rack, to the hot aisle, and then back to the CRAH to be cooled again. In this approach, you can reduce the total amount of air provided to the data hall because less air is lost to recirculation and bypass.
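To make the last point concrete, the sensible heat relation Q = ρ · V · cp · ΔT gives the air flow that must actually pass through a rack, and any bypass has to be supplied on top of that. The following is a minimal sketch, not an OCI design calculation. It assumes typical air properties (density of about 1.2 kg/m^3 and specific heat of about 1005 J/(kg·K)), and the function names, the 10 kW load, the 12 K temperature rise, and the bypass fractions are hypothetical values chosen only for illustration.

```python
AIR_DENSITY = 1.2   # kg/m^3, assumed typical value for air
AIR_CP = 1005.0     # J/(kg*K), assumed typical value for air


def rack_airflow_m3s(heat_load_w, delta_t_k):
    """Volumetric air flow needed to carry heat_load_w at a given temperature rise."""
    if delta_t_k <= 0:
        raise ValueError("delta_t_k must be positive")
    return heat_load_w / (AIR_DENSITY * AIR_CP * delta_t_k)


def supply_airflow_m3s(rack_flow_m3s, bypass_fraction):
    """Total supply air needed when a fraction of it bypasses the racks."""
    if not 0.0 <= bypass_fraction < 1.0:
        raise ValueError("bypass_fraction must be in [0, 1)")
    return rack_flow_m3s / (1.0 - bypass_fraction)


if __name__ == "__main__":
    rack_flow = rack_airflow_m3s(10_000.0, 12.0)  # hypothetical 10 kW rack, 12 K rise
    for bypass in (0.0, 0.1, 0.3):                # hypothetical bypass fractions
        total = supply_airflow_m3s(rack_flow, bypass)
        print(f"bypass {bypass:.0%}: supply {total:.3f} m^3/s")
```

The smaller the bypass fraction, the closer the total supply air comes to the air flow that the racks actually need.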
Industry design guidelines and best practices have now caught up with this concept, so if we now know exactly how to deliver air to the IT racks, and if we know to follow best practices such as containment, hot aisle and cold aisle, and clear pathways for air flow, what else can we improve? The answer is lots! CFD modeling has proven to be more than simply visualization of distribution of temperature, pressure, and velocity profiles. Optimizing air flow within a data hall has subtleties, and CFD modeling is an extremely valuable design tool for all data center applications, even when the data hall is configured in the most simplistic of layouts.
In a complex and critical data center environment, understanding and controlling the thermal environment is vital. This predictive capability is crucial for identifying potential issues before they escalate into problems that lead to inefficient cooling, equipment failure, or thermal runaway.
CFD modeling for data center optimization
CFD applies the principles of fluid mechanics and thermodynamics, using numerical analysis to solve and analyze problems involving fluid flow. To model a space such as a data hall, the space must be segmented into a mesh with nodes. The smaller the spacing between nodes, the more nodes the mesh contains and the more computational time it takes to solve a problem. We must strike a balance to determine how granular the solution must be. A coarse mesh can achieve a solution quickly, but might not be accurate down to the scale required for a particular problem. A finer mesh provides a more accurate solution, but dramatically increases the time to solution.
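A quick way to see why a finer mesh costs so much more is to count cells. The sketch below assumes a uniform structured grid, which is a simplification because real CFD tools typically refine the mesh only where it matters. The hall dimensions and the function name are hypothetical.

```python
import math


def cell_count(length_m, width_m, height_m, spacing_m):
    """Number of cells in a uniform grid covering a box-shaped space."""
    if spacing_m <= 0:
        raise ValueError("spacing_m must be positive")
    nx = math.ceil(length_m / spacing_m)
    ny = math.ceil(width_m / spacing_m)
    nz = math.ceil(height_m / spacing_m)
    return nx * ny * nz


if __name__ == "__main__":
    # Hypothetical 30 m x 20 m x 4 m data hall.
    coarse = cell_count(30.0, 20.0, 4.0, 0.5)   # 60 * 40 * 8 = 19200 cells
    fine = cell_count(30.0, 20.0, 4.0, 0.25)    # 120 * 80 * 16 = 153600 cells
    print(coarse, fine, fine / coarse)
```

Under this assumption, halving the spacing multiplies the cell count by eight, before accounting for the extra iterations a finer mesh usually needs.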
We find a solution by solving the conservation equations for mass, momentum, and energy at each node. Each pass through the nodes leaves an error, or residual. When the solution is iterated hundreds or thousands of times by feeding the output back into the input of the simulation, the residual is minimized, and the solution is considered converged. At this point, each node is resolved for its physical properties, including temperature, humidity, velocity, and pressure.
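The idea of iterating until the residual is small can be shown with a much simpler problem than a full CFD solve. The sketch below is a stand-in, not what a CFD package does internally: it uses Jacobi iteration on a steady-state heat conduction (Laplace) problem on a small square grid, and it tracks the largest change per pass as a proxy for the residual. The grid size, boundary temperatures, tolerance, and function name are all assumptions made for illustration.

```python
def solve_laplace(n=20, hot=35.0, cold=18.0, tol=1e-4, max_iter=10000):
    """Jacobi iteration on an n x n grid; returns the field and the residual history."""
    # Boundary nodes stay fixed: left edge hot, all other edges cold.
    temps = [[cold] * n for _ in range(n)]
    for i in range(n):
        temps[i][0] = hot

    residuals = []
    for _ in range(max_iter):
        new = [row[:] for row in temps]
        residual = 0.0
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                # Each interior node becomes the average of its four neighbors.
                new[i][j] = 0.25 * (temps[i - 1][j] + temps[i + 1][j]
                                    + temps[i][j - 1] + temps[i][j + 1])
                residual = max(residual, abs(new[i][j] - temps[i][j]))
        temps = new  # the output of this pass is the input of the next
        residuals.append(residual)
        if residual < tol:
            break
    return temps, residuals


if __name__ == "__main__":
    field, history = solve_laplace()
    print(f"passes: {len(history)}, final residual: {history[-1]:.2e}")
```

A production solver handles coupled mass, momentum, and energy equations and uses far more efficient schemes, but the loop has the same shape: update every node, measure the residual, and repeat until it falls below a tolerance.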
When used this way, CFD analysis helps identify risks early in the design phase, allowing for adjustments in layout, cooling infrastructure, temperature and pressure setpoints, or airflow management strategies to mitigate potential issues. The efficacy of CFD in the design and operation of data centers is contingent on the adherence to best practices throughout the modeling process. These practices ensure that CFD simulations are not only accurate but also relevant to the real-world challenges they aim to address.
Procedural steps
Now, let's establish a model and perform the simulation.
Step 1: Create a 3D model of the data hall
Creating the 3D model is like drawing up the blueprint of a neighborhood.
Start with a 3D-CAD representation of the data center. Alternatively, the CFD software can create the space and insert the passive objects that define the configuration, shape, and boundaries of the space.
Step 2: Insert active objects into the space
Passive objects are like buildings and roads in the blueprint.
After adding them, you can insert active objects into the virtual space. These active objects represent sources of heat transfer and mass flow, such as IT racks with heat generating servers, which also include server fans that move air through the heat producing devices. These objects have boundaries and should be represented as accurately as possible with respect to size, shape, position, air flow, and so on.
Active objects can also represent the introduction of air or moisture into and out of the space. Follow physical laws, for example, the introduction of air into the modeled space by a cooling system is balanced by an exact equal amount of mass extracted from that space. Equally important is the representation of thermal loads, which should mirror the actual heat output of the equipment intended to be placed in the data hall. This part requires detailed information on the equipment specifications and operational patterns, including variations in load that can impact the thermal environment. You must accurately input the properties of materials used in the data center and the boundary conditions, such as external temperatures and airflow rates.
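One simple sanity check on model inputs is to confirm that the air entering the modeled space equals the air leaving it before any simulation is run. The sketch below assumes mass flows are available in kg/s for each supply and return object. The function name, the tolerance, and the example values are hypothetical.

```python
import math


def is_mass_balanced(inflows_kg_s, outflows_kg_s, rel_tol=1e-6):
    """True if the total mass flow in matches the total mass flow out within rel_tol."""
    total_in = math.fsum(inflows_kg_s)
    total_out = math.fsum(outflows_kg_s)
    scale = max(abs(total_in), abs(total_out), 1e-12)
    return abs(total_in - total_out) <= rel_tol * scale


if __name__ == "__main__":
    supplies = [5.0, 5.0]   # hypothetical CRAH supply flows, kg/s
    returns = [4.0, 6.0]    # hypothetical return flows, kg/s
    print(is_mass_balanced(supplies, returns))
```

A check like this catches a mismatched supply or return value in the inputs before it shows up as a solution that refuses to converge.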
The adage "garbage in, garbage out" is particularly relevant in CFD modeling, where the accuracy of the output is directly related to the quality of the input data. You must validate the model inputs by verifying the accuracy of the physical and operational data used in the simulation, including cross-referencing equipment specifications, conducting measurements of actual airflow rates, and reviewing operational data for accuracy.
An aspect of the model that's often overlooked is the leakage between components in the virtual space, such as gaps between adjacent IT racks, between IT racks and containment walls, between containment panels, and between the racks and the floor on which the racks rest. This leakage, as small as it might seem, is the only possible source of recirculation and bypass air. Estimating the factor of leakage is difficult. The modeler can assume a certain percent leakage of the total air flow in a space, or, if the modeler cares to place enough detail in the model to physically include these fraction-of-an-inch gaps, the modeling can be made more accurate.
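The two approaches to leakage can be compared with a back-of-the-envelope estimate. The sketch below is an illustration under stated assumptions, not a substitute for modeling the gaps. The first function applies an assumed leakage percentage, and the second treats a gap as a sharp-edged orifice using Q = Cd · A · sqrt(2 · ΔP / ρ). The discharge coefficient of 0.6, the air density, the gap area, and the pressure difference are all assumed values.

```python
import math


def leakage_from_fraction(total_flow_m3s, leak_fraction):
    """Leakage estimated as an assumed fraction of the total air flow."""
    if not 0.0 <= leak_fraction <= 1.0:
        raise ValueError("leak_fraction must be in [0, 1]")
    return total_flow_m3s * leak_fraction


def leakage_from_gap(gap_area_m2, delta_p_pa, discharge_coeff=0.6, air_density=1.2):
    """Leakage through a gap treated as an orifice (assumed Cd and density)."""
    if gap_area_m2 < 0 or delta_p_pa < 0:
        raise ValueError("gap area and pressure difference must be non-negative")
    return discharge_coeff * gap_area_m2 * math.sqrt(2.0 * delta_p_pa / air_density)


if __name__ == "__main__":
    # Hypothetical values: 10 m^3/s total flow, 2% assumed leakage,
    # 0.01 m^2 of total gap area, 5 Pa across the containment.
    print(leakage_from_fraction(10.0, 0.02))
    print(leakage_from_gap(0.01, 5.0))
```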
The virtual representation of the data hall, its loads, and its boundary conditions might not ever be a realistic view of everyday operation. In most cases, CFD modeling represents the design condition: the worst-case load with the full complement of cooling equipment running at the maximum expected condition.
The designer looks for indications that the cooling system can support the load in the peak configuration under stress conditions, such as a failure of redundant equipment. If the space configuration works under these stress conditions, it also works under lower loads. Expecting that you can address every possible loading is unreasonable. The permutations of loads and air flows are nearly infinite in number.
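Before running a stress case in CFD, a simple capacity check can confirm that the scenario is worth simulating at all. The sketch below assumes that the worst case is losing the largest cooling units, and it compares only nameplate capacity against peak load. It says nothing about air distribution, which is what the CFD model is for. The function name and the unit capacities are hypothetical.

```python
def survives_unit_failure(unit_capacities_kw, peak_load_kw, failed_units=1):
    """True if the remaining units can carry the peak load after the largest ones fail."""
    if failed_units < 0:
        raise ValueError("failed_units must be non-negative")
    if failed_units > len(unit_capacities_kw):
        return False
    # Assume the worst case: the largest units are the ones that fail.
    remaining = sorted(unit_capacities_kw)[: len(unit_capacities_kw) - failed_units]
    return sum(remaining) >= peak_load_kw


if __name__ == "__main__":
    crah_units = [100.0] * 5                          # hypothetical capacities, kW
    print(survives_unit_failure(crah_units, 400.0))   # four units cover 400 kW
    print(survives_unit_failure(crah_units, 450.0))   # four units cannot cover 450 kW
```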
Step 3: Run the CFD model
When the model and its active and passive components are good representations of the design condition, the meshing and computational portion of the simulation starts. Starting a simulation with a coarse meshing to allow for an approximate solution can give a quick overview to demonstrate whether the results are helpful or realistic. You can then adjust the model, and set the meshing to a finer setting to get a more accurate representation of the model.
Use CFD modeling as part of an iterative design process, where you make design adjustments based on simulation outcomes, and the model is refined continuously until you achieve the thermal performance you want. You can use CFD modeling to simulate the impact of various rack layouts on air distribution patterns, revealing how certain configurations lead to recirculation zones and hot spots near high-density racks. You can then adjust the layout until the data center achieves a uniform temperature distribution.
Let's look at an actual case study in which CFD modeling was used to determine the proper placement of pressure sensors and see how CFD modeling guides the strategic positioning of these sensors in data center design.
Hot aisle pressure sensor placement in data center design
A best practice for data center layout is to organize the rows of IT racks so that CRAH units can deliver cold air to the cold aisles, and remove hot air from the hot aisles. The front of each rack faces the cold aisle, and the back of each rack faces the hot aisle. Other best practices include maintaining containment panels so that cold air can't "bypass" the IT racks and enter the hot aisle, which would waste cooling capacity, and hot air can't "recirculate" into the cold aisle, which would contaminate the cold air and raise its temperature.
One common best practice is to control the supply air flow into the data hall to maintain a slightly positive pressure difference between cold aisle and hot aisle. This way, only a minimal amount of air is bypassed into the hot aisle and little to no air can recirculate into the cold aisle. Following the best practices doesn't require CFD modeling. The best practices are baked into the design guidelines to deliver a system that should work.
But here, things get interesting: The placement of the hot aisle pressure sensor can't necessarily follow certain best practices. If you stick the hot aisle pressure sensors directly behind a rack, the readings might not be representative of the conditions throughout a hot aisle and throughout a data hall if the rack is a high-power, high-air flow rack.
So, we need to figure out the best location for the hot aisle pressure sensor and how many sensors are needed to be most representative of the aggregate condition throughout a data hall. The CFD model can answer that. We analyzed an entire data hall with many hot aisle-cold aisle pairs using five different models. For each model, the cold aisle pressure sensor wasn't of concern because the cold aisles are wide, and the air velocities are relatively low. For the hot aisles, the case was not the same. Figure 1 shows a simple representation of one hot aisle-cold aisle pair and doesn't show the full extent of all the sensors in multiple aisles that were modeled in aggregate. We saw the following results over the five models:
Figure 1: Sample representation of the use of CFD modeling as a design tool for hot aisle pressure sensor placement.
The first CFD model has pressure sensors directly behind a rack. The sensor is subject to variations in air velocity, which greatly affect the pressure reading.
The second has pressure sensors in the middle of the hot aisle. As in the first model, the sensor is still subject to too many variations from rack position to rack position, and can't provide a good, aggregate reading for the hot aisle.
The third model has pressure sensors in the middle of the hot aisle but raised out of the discharge air stream from the racks. This location provides a better reading, but the quickly rising air flow can still cause many variations, and the sensor might still be subject to much turbulence because it sits at the receiving end of two high-power racks.
The fourth has pressure sensors between containment and return air plenum. At this location, the air flow within the hot aisle has stabilized, and becomes a good indicator of the average condition within the hot aisle. We find that this location is also less subject to variation and becomes a good indicator of pressure regardless of the load in each rack.
The final model has pressure sensors further into the return air pathway, in the ceiling return air plenum. Although the air flow has stabilized when it enters the ceiling space, the pressure from the hot aisle has already dissipated because it makes a turn, exits the hot aisle, and possibly mixes with the return air paths of another hot aisle. So, this location isn't as representative of the hot aisle pressure.
The fourth model showed the most stable control, delivering an appropriate air flow to the cold aisles, and maintaining a uniform, slightly positive pressure difference between cold aisles and hot aisles. But this solution isn't "one-size-fits-all." We must accept that each configuration of the data hall and selection of IT racks and their associated loads has unique challenges and solutions. CFD is a valuable engineering tool, working together with good data center engineering best practices to keep your data center running smoothly. The integration of CFD insights with best practices in data center design leads to optimal solutions.
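One way to compare candidate sensor locations is to export the pressure at each location for several simulated load cases and rank the locations by how much the reading varies. The sketch below shows only that ranking step and is not the method used in the study above. The function name is hypothetical, and the readings are made-up placeholder numbers, not results from any CFD run.

```python
from statistics import pstdev


def rank_sensor_locations(readings_pa):
    """Return location names ordered from least to most variable reading.

    readings_pa maps a location name to a list of pressure readings in Pa,
    one per simulated load case.
    """
    spread = {name: pstdev(values) for name, values in readings_pa.items()}
    return sorted(spread, key=spread.get)


if __name__ == "__main__":
    # Placeholder values for illustration only; not simulation output.
    placeholder_readings = {
        "location_a": [1.0, 5.0, 9.0],
        "location_b": [4.0, 5.0, 6.0],
    }
    print(rank_sensor_locations(placeholder_readings))
```

Low variability across load cases is only one criterion. As noted above, a location can be stable and still not represent the hot aisle pressure, so a ranking like this supports engineering judgment rather than replacing it.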
Conclusion
Maintaining optimal thermal conditions in data centers amidst the growing challenges of heat and climate change is crucial. Here are the key takeaways from our journey in data center optimization:
Computational fluid dynamics (CFD) analysis plays a crucial role in designing and optimizing data center cooling systems. By simulating airflow and temperature distribution, CFD helps identify potential hotspots and optimize the placement of critical components, ensuring efficient and reliable cooling under varying conditions.
The integration of CFD insights with best practices in data center design is essential for maintaining efficient and reliable operations, allowing us to meet the challenges of heat generation and climate change head-on.
Oracle Cloud Infrastructure (OCI) continues to push the boundaries of data center design and operation, ensuring that our facilities are prepared to meet the demands of today and the challenges of tomorrow.
This blog series highlights the new projects, challenges, and problem-solving OCI engineers are facing in the journey to deliver superior cloud products. You can find similar OCI engineering deep dives as part of the Behind the Scenes with OCI Engineering series, featuring talented engineers working across Oracle Cloud Infrastructure.
For more information, see the following resources:
Computational fluid dynamics at OCI
Run computational fluid dynamics on Oracle Cloud Infrastructure quickly and easily with Oracle Quick Start