Fractal Design Integra series 450W SFX power supply oopsie (WHL #63)
Someone missed posting two weeks ago and hasn’t made up for it yet. That guy was me, and here’s a two-part random collection on why that happened, what fun things I spent time with instead, and why hacking hardware sometimes sucks (a little). I’ll classify this one as a regular post due to the power supply teardown and the other one as a project, which should be up by next Sunday.
Well, when I started gathering photos a fortnight ago to make a new post, my ZFS upgrade project (more on that in a minute) was finally going smoothly after a rocky start when creating the new zpool. Not all that interesting, but after a bit of digital housekeeping I finally had a working system to copy old data to new drives, so I just ran that copy process. That takes literally hours, as we’re talking about roughly 10 TB at speeds of 10-15 GB per minute from a purposefully degraded source. So after everything else was taken care of for the day/weekend, I was set to make a new blog post.
That tranquillity was abruptly ended by a loud BANG. Sudden darkness, HDD and fan spindown, the smell of amps, the full oh-fuck show with the guy from photonicinduction laughing in the credits.
Thankfully, a very quick disassembly of my UPS (made easy by the external battery mod in #P11F1) showed that there was no imminent fire risk from the UPS being overloaded while still being somewhat force-fed by the lead-acid batteries.
Must have been something in the server itself, which at this point had been running for around two hours with external fans to get rid of ~350W despite having only five wimpy 80mm fans in it (all that server grade stuff isn’t gonna work in the living room). Before that, it was only hovering at ~250W as there was little load on both the hard disks and the CPU. Also, half an hour prior I had been in the kitchen for some time and thought, on entering the living room, that something smelled of warm plastic. I had been in the living room for basically all day, so I could simply have gotten used to the smell (especially since I had put a kitchen sponge near the RAID cards for urgent air flow adjustments), but I actually checked if something was wrong and couldn’t find anything except for very warm wallpaper behind the rack.
That could be bad news or very bad news, since at that point the chassis was loaded with 16 hard disks: 6 that carry ALL of my data and 10 new ones that were not yet synced. The 2 remaining ones from the old array do not provide enough redundancy to recover any data, and the 3 backup drives only have the most important ZFSes on them, not all of them, due to size restrictions. Worst case, this could have been it for 25 years of data from floppy disks, CDs, DVDs, VHS and trustworthy sources on the interwebs (3dfx HammerHead FX drivers, anyone?).
I’m very glad to announce that this is not the case: all hardware except for the power supply survived. So this was only bad news, since I now needed a new SFX power supply, which complicated things later on.
The (remains of the) power supply in question:
Bought used, this Fractal Design 450W Integra series power supply, made by FSP (FSP450-60GHS(85)), failed due to overheating. Now, thing is, there are usually two separate protection methods against this type of catastrophic failure: there’s over-temperature protection (OTP), and there’s the tachometer signal from the fan. OTP clearly isn’t implemented in a way that worked (or not at all, hard to tell since FSP does not advertise this model at all and Fractal Design only sold it bundled with SFX cases, so no datasheet here), and the fan speed control was, well, skimped on:
The fan had a regular 3-pin plug on an extremely short cable, less than 5cm long. Besides saving cents by not implementing a watchdog for fan failure, look at that connection: Geez, FSP, there has to be a better way to solder such a connector to the PCB?
Fan failure, by the way, was the culprit here, as the “Blacknoise Industral” 8015-2000-12 fan made by Noiseblocker (a reputable company?) did not start properly below ~11V and even at 12V was far away from achieving 2000 rpm. At least one drive coil was high impedance after removal. Now that I know this, I remember seeing the fan off and starting very slowly a couple of times, but I thought that the power supply might have a semi-passive mode for low load scenarios. That might not have been the case at all, it just never mattered as this power supply rarely had to deliver north of 200 watts…
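For the curious, here is a minimal sketch of what such a fan watchdog boils down to, assuming the tach wire of a 3-pin fan emits the usual two pulses per revolution. The function names and the 600 rpm threshold are made up for illustration and have nothing to do with whatever FSP did or didn’t implement:

```python
PULSES_PER_REV = 2  # assumption: the usual tach output of a PC fan


def rpm_from_pulses(pulse_count: int, window_s: float) -> float:
    """Convert tach pulses counted during window_s seconds into rpm."""
    return pulse_count / PULSES_PER_REV / window_s * 60


def fan_ok(pulse_count: int, window_s: float, min_rpm: float) -> bool:
    """True if the fan spins at least at min_rpm (hypothetical threshold)."""
    return rpm_from_pulses(pulse_count, window_s) >= min_rpm


# A healthy 2000 rpm fan delivers roughly 67 pulses per second.
print(rpm_from_pulses(67, 1.0))       # 2010.0
# A stalled fan delivers none, which is when the supply should shut down.
print(fan_ok(0, 1.0, min_rpm=600))    # False
```

A supply that latched off as soon as `fan_ok` came back false would have turned this incident into a silent shutdown instead of a bang.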
Anger aside, pure luck or other working protection circuits saved the rest of the server, so that is very good news indeed. Here are two more shots of the PCB. Solder quality isn’t exactly terrific, especially with lots and lots of uncleaned flux. They used a wild mixture of CapXon and Teapo capacitors, each from a different series, culminating in a CapXon poly where it apparently mattered the most. They’re 105°C types, but it’s not exactly Nichicon or Panasonic, is it?
The solder is also very shiny, but according to the label it’s RoHS compliant, so no leaded solder should be present. The three top-right solder joints were touched by me when removing the heat-shrunk fuse, so that wasn’t FSP’s fault.
A day later I decided to connect this thing to mains power again, after I had carefully removed the PCB from the metal shell, located the blown ceramic fuse and replaced it with an external fuse holder plus a matching 8A slow-blow fuse, this time only a glass-tube one. Here’s what that brand new fuse looked like after working as a very bright indicator light for a very short period of time:
Erm, yes, that explains how that thing could have made my UPS hiccup, tripped the local 16A breaker in that room AND one of the 25A breakers of the entire apartment – there’s a massive short in there, and I doubt that’s fixable. But that was to be expected given the following infrared photos, taken roughly 25-30 minutes after the incident. (Sorry for all the small temperature scales, I do not have access to a regular Windows desktop right now and the crappy FLIR data format isn’t documented for use in open source software.)
When I first got it out of the case, I spotted over 100°C with the camera, and even that was like 15 minutes after it happened. “The transformer is lava” probably describes the minutes around the incident best.
Well, after all that stuff was sorted, I hooked up an older be quiet ATX supply, which created a terrible mess outside of the case, and started again. Since the backplane does support staggered spinup and all new SAS drives do obey these orders, the server does boot in this configuration, contrary to running the disks in my other 16-bay enclosure, which peaks at 500W before this 350W power supply cuts out. I then continued transferring data to the new zpool – it’s just so much slower to clean up afterwards with a 1Gb/s connection and your laptop instead of the 10Gb/s test system or desktop, which currently have no working board or no power supply. But that whole aftermath will be covered in the second part of this post.
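To illustrate why staggered spinup makes such a difference for a small power supply, here is a rough sketch. All wattages in it are made-up placeholders, not measurements from my server, and the model simply assumes that a drive draws a fixed spinup power while starting and a fixed idle power afterwards:

```python
def peak_power_w(n_drives: int, group_size: int,
                 spinup_w: float, idle_w: float, base_w: float) -> float:
    """Worst-case draw while drives spin up in groups of group_size."""
    peak = 0.0
    started = 0  # drives that already finished spinning up
    while started < n_drives:
        spinning_up = min(group_size, n_drives - started)
        draw = base_w + started * idle_w + spinning_up * spinup_w
        peak = max(peak, draw)
        started += spinning_up
    return float(peak)


# Hypothetical values: 30 W per drive during spinup, 10 W idle, 80 W for the rest.
all_at_once = peak_power_w(16, 16, spinup_w=30, idle_w=10, base_w=80)
one_by_one = peak_power_w(16, 1, spinup_w=30, idle_w=10, base_w=80)
print(f"all at once: {all_at_once:.0f} W, staggered: {one_by_one:.0f} W")
# all at once: 560 W, staggered: 260 W
```

With numbers in that ballpark, starting every drive at once would overwhelm a 350W supply, while starting them one at a time stays well below its limit.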
Safe to say I need a new SFX power supply, and by pure coincidence it will either be a Seasonic Focus SGX Gold 500W SFX12V-L (I’m running a regular Focus Gold already in the desktop), or it’ll be an FSP Dagger Pro 550W SFX12V. Yes, FSP, the guys that almost blew up ten terabytes of my data by fucking up OTP and fan status control in the old power supply. Aah, gotta love the huge variety in SFX power supplies with suitable Molex connectors for your Supermicro backplane…