Post by SATTELO

Same cold coolant to all 16 racks. So why were the far-end racks running 14 °C hotter, and silently throttling? 👇

𝗧𝗵𝗲 𝘀𝗲𝘁𝘂𝗽
We took a real retrofit question: drop 16× NVIDIA H100 racks (560 kW in total) into an existing 5 MW air-cooled colocation hall. We built the entire liquid-cooling chain as a dynamic 1D model in Modelon Impact: GPU cold plate → CDU → coolant manifold → 16 racks. We validated it component by component. Everything checked out… until we looked at the manifold.

⚠️ 𝗧𝗵𝗲 𝗵𝗶𝗱𝗱𝗲𝗻 𝗽𝗿𝗼𝗯𝗹𝗲𝗺: 𝗳𝗹𝗼𝘄 𝗶𝗺𝗯𝗮𝗹𝗮𝗻𝗰𝗲
With a plain manifold, the racks farthest from the CDU received less than a third of the coolant flow of the nearest racks (0.41 vs 1.34 kg/s, a ratio of about 3.3), simply because they sit at the end of a longer pipe run. The far-end GPUs ran ~14 °C hotter and crossed into thermal throttling. On a spec sheet, every rack gets "the same supply." In reality, a third of the fleet was being starved.

✅ 𝗧𝗵𝗲 𝗳𝗶𝘅: 𝗯𝗮𝗹𝗮𝗻𝗰𝗶𝗻𝗴 𝘃𝗮𝗹𝘃𝗲𝘀
We re-balanced the loop in simulation: flow equalised to within 0.3 % across all 16 racks, every cold plate within ~1 °C of the others, throttling gone. No hardware trial, no torn-up pipework. Just the right valve settings, found on a screen.

𝗚𝗿𝗼𝘂𝗻𝗱𝗲𝗱 𝗶𝗻 𝗿𝗲𝗮𝗹𝗶𝘁𝘆
The inputs aren't hand-waving: coolant flow per kW of rack load (1.5 L/min/kW), CDU sizing (300 kW units) and PUE (1.28) all line up with current industry practice, namely ASHRAE thermal guidelines and real CDU specs.

𝗦𝗶𝗺𝘂𝗹𝗮𝘁𝗲 𝗯𝗲𝗳𝗼𝗿𝗲 𝘀𝘁𝗲𝗲𝗹
You can find, and fix, a problem that would quietly throttle your GPUs before a single pipe is installed. That's the whole point.

🔭 𝗪𝗵𝗮𝘁'𝘀 𝗻𝗲𝘅𝘁 (𝗣𝗮𝗿𝘁 𝟮)
This is Part 1: "does the retrofit work, and is it balanced?" The harder questions come with scale. In Part 2 we grow the data center (more racks, more load) and watch where the cooling plant strains. Then we model the upgrade most AI operators are now weighing: moving off chilled water to warm-water cooling, where free-cooling can collapse the energy overhead and largely switch the chiller off. Same model, three scenarios, real PUE numbers.
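
For readers who want to sanity-check the Part 1 numbers, here is a rough back-of-envelope sketch in Python. It is not the Modelon Impact model. It assumes a water-like coolant (density 1 kg/L, specific heat 4.18 kJ/(kg·K)) and the 560 kW spread evenly over the 16 racks; both are assumptions, not figures from the study.

```python
import math

# Figures quoted in the post
N_RACKS = 16
TOTAL_LOAD_KW = 560.0
FLOW_SPEC_L_MIN_PER_KW = 1.5
CDU_CAPACITY_KW = 300.0
FAR_FLOW_KG_S = 0.41
NEAR_FLOW_KG_S = 1.34

# Assumptions (not from the post): water-like coolant, evenly spread load
DENSITY_KG_PER_L = 1.0
CP_KJ_PER_KG_K = 4.18

rack_load_kw = TOTAL_LOAD_KW / N_RACKS

# L/min/kW -> kg/s for one rack
design_flow_kg_s = rack_load_kw * FLOW_SPEC_L_MIN_PER_KW / 60.0 * DENSITY_KG_PER_L

# Minimum CDU count by capacity alone (no redundancy margin)
cdus_needed = math.ceil(TOTAL_LOAD_KW / CDU_CAPACITY_KW)


def coolant_rise_k(load_kw, flow_kg_s):
    """Steady-state coolant temperature rise: dT = Q / (m_dot * cp)."""
    return load_kw / (flow_kg_s * CP_KJ_PER_KG_K)


far_rise_k = coolant_rise_k(rack_load_kw, FAR_FLOW_KG_S)
near_rise_k = coolant_rise_k(rack_load_kw, NEAR_FLOW_KG_S)

print(f"Per-rack load:        {rack_load_kw:.1f} kW")
print(f"Design flow per rack: {design_flow_kg_s:.3f} kg/s")
print(f"CDUs needed (min):    {cdus_needed}")
print(f"Coolant rise, far:    {far_rise_k:.1f} K")
print(f"Coolant rise, near:   {near_rise_k:.1f} K")
print(f"Far minus near:       {far_rise_k - near_rise_k:.1f} K")
```

Under these assumptions the per-rack load is 35 kW, the design flow is 0.875 kg/s per rack (which sits between the 0.41 and 1.34 kg/s extremes), and the far-end coolant warms up about 14 K more than the near-end coolant. That is in the same range as the ~14 °C quoted above, though the study's figure is a GPU temperature from the full dynamic model, not a simple energy balance.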
If you're retrofitting air-cooled halls for AI density, sizing CDUs, or worried about flow balance and PUE, let's talk. This is what Sattelo does. Follow along for Part 2. 🔧

#DataCenter #LiquidCooling #ThermalManagement #ModelonImpact #SystemSimulation #AIInfrastructure #PUE #DigitalTwin #CDU #DirectLiquidCooling

(Illustrative engineering study, modelled and validated in Modelon Impact. GPU specs per NVIDIA datasheets.)
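
P.S. Why does a plain manifold starve the far racks? A toy Python sketch makes the mechanism visible. It is not the Modelon Impact model: it assumes a direct-return ladder (supply and return headers both start at the CDU), a purely quadratic pressure-drop law Δp = k·ṁ², and made-up resistance coefficients `K_RACK` and `K_HEADER` in arbitrary units. The function names are hypothetical too. Only the 16 racks and the 14 kg/s total flow (16 racks × 35 kW × 1.5 L/min/kW, water-like coolant assumed) follow from the post.

```python
import math


def branch_flows(k_branch, k_header, total_flow):
    """Flow per rack (index 0 = nearest the CDU) in a direct-return ladder.

    Works backwards from a trial flow in the farthest rack, then rescales.
    The rescaling is exact because every pressure drop is quadratic in flow.
    """
    n = len(k_branch)
    flows = [0.0] * n
    flows[-1] = 1.0                                  # trial flow, farthest rack
    dp = k_branch[-1] * flows[-1] ** 2               # drop across that branch
    downstream = flows[-1]                           # flow in the header beyond rack i
    for i in range(n - 2, -1, -1):
        dp += 2.0 * k_header * downstream ** 2       # supply + return segment
        flows[i] = math.sqrt(dp / k_branch[i])
        downstream += flows[i]
    scale = total_flow / downstream                  # downstream is now the total
    return [q * scale for q in flows]


def balancing_valve_k(n, k_header):
    """Extra branch resistance per rack that equalises flow (farthest rack: 0)."""
    extra = [0.0] * n
    acc = 0.0
    for i in range(n - 2, -1, -1):
        racks_downstream = n - 1 - i                 # racks fed past rack i
        acc += 2.0 * k_header * racks_downstream ** 2
        extra[i] = acc
    return extra


N_RACKS = 16
TOTAL_FLOW_KG_S = 14.0   # 16 x 35 kW x 1.5 L/min/kW, water-like coolant assumed
K_RACK = 1.0             # illustrative rack-branch resistance (arbitrary units)
K_HEADER = 0.002         # illustrative header-segment resistance (arbitrary units)

plain = branch_flows([K_RACK] * N_RACKS, K_HEADER, TOTAL_FLOW_KG_S)
valves = balancing_valve_k(N_RACKS, K_HEADER)
balanced = branch_flows([K_RACK + kv for kv in valves], K_HEADER, TOTAL_FLOW_KG_S)

print(f"plain manifold:  near {plain[0]:.3f} kg/s, far {plain[-1]:.3f} kg/s")
print(f"with valves:     near {balanced[0]:.3f} kg/s, far {balanced[-1]:.3f} kg/s")
```

With any positive header resistance, the plain case gives the nearest rack the most flow and the farthest the least. Adding throttling to the near branches evens the flows out, at the cost of a higher total pressure drop for the pump to overcome. The coefficients here are not tuned to reproduce the 0.41 vs 1.34 kg/s split from the study.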