Microsoft is seeking better ways to manage powerful new AI hardware in its cloud computing operation, and is focusing on immersion cooling, a technology used in bitcoin mining, as the most promising approach for future high-density data centers.

The company is test-driving a setup in which servers are dunked in tanks of cooling fluid to manage rising heat densities. Citing the need to prepare for more powerful new chips, Microsoft is evaluating “promising” immersion technologies that are currently used in bitcoin mining operations.

“We’ve been investigating how we can achieve better cooling efficiency, and liquid cooling is what we’ve been focusing on,” said Mark Russinovich, the Chief Technology Officer of Microsoft’s Azure Cloud, in a presentation at Microsoft Ignite. “What we’ve locked on as likely where we’re headed in our data centers is two-phase liquid cooling, and we’ve made a ton of progress down this path.”

Microsoft showed off a cooling lab where it is cooling servers inside a two-phase immersion system from Allied Control. Allied Control is owned by bitcoin mining specialist BitFury, which uses the design to support power densities of up to 250 kW per rack. Immersion can deliver exceptional power efficiency because it uses sealed tanks that don’t require the raised floors or room-level air cooling found in most commercial data centers.

That’s appealing to the Microsoft team, which likes the potential economic gains from packing more compute into a smaller amount of real estate.

“How do we pack and leverage the floor space in our data centers more efficiently?” Russinovich asked. “If we’re air cooling them, which is the way we’re cooling these data centers today, we’ve got to leave hot air aisles and cool air aisles and have huge HVAC systems that are pumping air in and out of the data center. It’s a large use of space.”

“Liquid cooling affects the whole ecosystem,” said Husam Alissa, a Principal Engineer in Microsoft’s liquid cooling lab. “When you take a look at the data center and the server and the sustainability promise that Microsoft is making, liquid cooling can help us get there faster. With liquid cooling, we can have higher density racks that could lead to smaller data center footprint, and lower data center energy consumption from the mechanical cooling perspective, and also from the server perspective, because we could reduce or remove the fans from the servers.”

Microsoft Moonshots Could Transform the Data Center

The liquid cooling research is one of many ways that Microsoft is testing new technologies that could radically change data centers. As the largest player in the market for leased data center space, Microsoft’s decisions on its core server, power and cooling technologies could have outsized influence in the data center industry. These moonshots and science experiments are in various phases of deployment, but make it clear that Microsoft is preparing to build a very different kind of future cloud as it seeks to meet aggressive carbon reduction goals.

These initiatives include:

In an era of exciting advances in data center design, Microsoft’s research seeks to extend the frontiers of computing, with major implications for the broader industry. Thus far, these initiatives have yet to be deployed at scale. But Microsoft’s focus on liquid cooling is being driven by changes in hardware that command a response.

The rise of artificial intelligence, and the hardware that supports it, is reshaping the data center industry’s relationship with servers. New hardware for AI workloads is packing more computing power into each piece of equipment, boosting the power density – the amount of electricity used by servers and storage in a rack or cabinet – and the accompanying heat. The trend is challenging traditional practices in data center cooling, and prompting data center operators to adopt new strategies to support high-density racks.

More Server Power Requires More Cooling

In his presentation at Ignite, Russinovich noted the heat generated by powerful new hardware like the NVIDIA A100 GPU chips, which feature prominently in new Azure Cloud offerings.

“The A100 really represent kind of trend that we’ve been seeing … for more and more power consumption per server,” said Russinovich, who said general purpose chips were also featuring more cores and drawing more power.

He said Microsoft has experimented with several approaches to liquid cooling, including a cold plate design, which typically contains a tubing system filled with liquid refrigerant. Russinovich tested cold plates using a personal gaming system, but said this approach would be hard to scale.

“That is a great way to cool, but it’s got the downside that every single server has to be custom fitted for the pipes and the cold plates,” he said. “It’s not a one size fits all model.”

Microsoft Azure CTO Mark Russinovich shares a cold plate cooling design he used in a gaming system to test advanced cooling designs. Nice rig, but needs more colored lights. (Image: Microsoft)

He cited similar challenges with a single-phase immersion cooling system in which liquid coolant is introduced into the server chassis.

That’s why Microsoft is focused on two-phase immersion cooling, in which servers are immersed in a coolant fluid that boils off as the chips generate heat, removing the heat as it changes from liquid to vapor. The vapor then condenses into liquid for reuse, all without a pump. Data center designs for this “phase change” cooling were pioneered by 3M as a use case for its Novec engineered fluid, which has a low boiling point. (Flashback: See my initial coverage of this technology in 2012)
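As a rough illustration of why boiling moves so much heat, the sketch below estimates how much fluid must vaporize each second to carry away a given heat load, using the relation mass rate = heat load / latent heat of vaporization. The function name and the 100 kJ/kg latent heat are illustrative assumptions, not figures from Microsoft or 3M; a real estimate would use the value from the fluid’s data sheet.

```python
def boil_off_rate_kg_per_s(heat_load_kw: float, latent_heat_kj_per_kg: float) -> float:
    """Mass of coolant that must vaporize per second to absorb the heat load.

    Hypothetical helper for illustration; it ignores sensible heating of the
    liquid and assumes all heat leaves through the phase change.
    """
    if heat_load_kw < 0 or latent_heat_kj_per_kg <= 0:
        raise ValueError("heat load must be >= 0 and latent heat must be > 0")
    # 1 kW is 1 kJ/s, so (kJ/s) / (kJ/kg) yields kg/s.
    return heat_load_kw / latent_heat_kj_per_kg


# ASSUMPTION: a round placeholder value, not a measured property of any named fluid.
ASSUMED_LATENT_HEAT_KJ_PER_KG = 100.0

# 250 kW is the per-rack density cited in the article.
rate = boil_off_rate_kg_per_s(250.0, ASSUMED_LATENT_HEAT_KJ_PER_KG)
print(f"{rate:.2f} kg of fluid vaporized per second")
```

In a sealed tank the vapor condenses and drips back into the bath, so this is a circulation rate inside the enclosure rather than a rate at which fluid is consumed.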

Bitcoin Tech Comes to Hyperscale

The video from Ignite showed a two-phase immersion system operating in Microsoft’s liquid cooling lab. The servers were enclosed in a system from Allied Control, which has been a pioneer in adapting extreme density cooling systems using Novec (which is not named as the coolant in the Microsoft video, although 3M-branded boxes are visible in the background). In 2014 it created a high-density immersion cooling system that allowed its client to deploy a bitcoin mining operation on the upper floors of a skyscraper in Hong Kong.

The company was then acquired by bitcoin specialist BitFury, which brought the technology to scale in containers filled with immersion tanks, which it used to create a 40-megawatt facility that features power densities of 250 kW per enclosure.
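Those two figures imply a simple back-of-the-envelope count of enclosures. The sketch below assumes, for illustration only, that the full 40 megawatts feeds IT equipment at the stated 250 kW per enclosure. The function name is hypothetical, and a real facility reserves some of its power for cooling and other overhead, so the actual count would be lower.

```python
def enclosures_supported(facility_mw: float, kw_per_enclosure: float) -> int:
    """Whole enclosures that fit within a facility power budget.

    Hypothetical helper; assumes every watt of facility power reaches the
    enclosures, which overstates the count for any real site.
    """
    if facility_mw < 0 or kw_per_enclosure <= 0:
        raise ValueError("facility power must be >= 0 and enclosure power must be > 0")
    facility_kw = facility_mw * 1000  # 1 MW = 1000 kW
    return int(facility_kw // kw_per_enclosure)  # round down to whole enclosures


# Figures from the article: a 40 MW facility at 250 kW per enclosure.
print(enclosures_supported(40, 250))
```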

That disruptive potential is why the Open Compute Project has been working to enable wider adoption of liquid cooling, citing demand from hyperscale computing providers, as well as new applications in edge computing. Microsoft has been among the participants in the OCP liquid cooling initiatives, which followed Google’s revelation that it has shifted to liquid cooling with its latest hardware for artificial intelligence.

Microsoft signaled its growing interest in liquid cooling in a blog post last year on its OCP initiatives.

“While liquid cooling is a technology that has been used in specific use cases, such as bitcoin mining, we are not only investing in the solutions and technologies that will power new architectures, but also focusing intensely on the challenges that will come into play as we look to extend the reach of these capabilities to a hyperscale cloud,” the company wrote.
