Air Conditioning FAIL

On Saturday all three air conditioning units in the server room shut down, and the place rapidly turned into an oven. Our servers put out a lot of heat, and have to be kept cool to prevent Bad Things from happening… and so when the air handlers stopped, Bad Things started to happen.

Luckily, only a couple of servers had actual hardware damage, and those didn’t have anything critical on them. Several more servers shut down ungracefully or started behaving erratically. Our two biggest servers, cougar and sundown, never actually crashed, but since our main network infrastructure server did, nobody could get to either of them.

Since I live so close to campus, I got called in, but it was Paul Lambert and Dave Diemer who did most of the heavy lifting. Once the major problems were cleared away, I could do my thing. Dave was still working on three servers until the next morning, and I was up until really late babysitting the webserver, which seemed to go catatonic every few minutes for no apparent reason. We’ll still be cleaning this up for a while.

