A shocking surprise after my UPS failed

So in my cellar I have a Linux server… and a NAS running FreeNAS… and a managed gigabit switch… and a router running pfSense… and maybe some PoE gear feeding power to an ADSL modem and a wifi access point elsewhere in the house. Yeah, I like complicated, complicated is cool.

Losing data because the power goes out isn’t cool. But having a UPS and the NUT UPS tools on my network is really cool – power goes out, the UPS starts screaming at me and all the hardware gets told and shuts down in a tidy manner. Very nice.

What isn’t nice is when the UPS appears to be faulty – randomly deciding the mains has gone off even when it hasn’t, which is what it did just before 1:55am this morning:

Aug 4 01:51:57 cex upsmon[1996]: UPS ups@chi on battery
Aug 4 01:55:22 cex upsmon[1996]: UPS ups@chi battery is low
Aug 4 01:55:27 cex upsmon[1996]: UPS ups@chi: forced shutdown in progress

Nice to know the system works, but would have been nicer if it only shut down when the actual mains had actually gone off, for real (like it did the other day at about 3pm). Seems I also need a better UPS, 4 minutes of power is a bit … short.
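For the record, the "4 minutes" is rounding up. A minimal Python sketch (nothing to do with NUT itself, just arithmetic on the three syslog lines above) works out the real battery runtime. The year is a placeholder assumption, since syslog timestamps don't carry one, and the helper names are made up for illustration.

```python
from datetime import datetime

# The three upsmon lines from the log above, copied verbatim.
LOG_LINES = [
    "Aug 4 01:51:57 cex upsmon[1996]: UPS ups@chi on battery",
    "Aug 4 01:55:22 cex upsmon[1996]: UPS ups@chi battery is low",
    "Aug 4 01:55:27 cex upsmon[1996]: UPS ups@chi: forced shutdown in progress",
]


def parse_stamp(line, year=2000):
    """Parse the leading syslog timestamp of a log line.

    Syslog omits the year, so a placeholder year is supplied (assumption);
    it does not matter for differences within the same day.
    """
    month, day, clock = line.split()[:3]
    return datetime.strptime(f"{year} {month} {day} {clock}", "%Y %b %d %H:%M:%S")


def seconds_between(first, last):
    """Whole seconds elapsed between two log lines."""
    return int((parse_stamp(last) - parse_stamp(first)).total_seconds())


on_battery_to_low = seconds_between(LOG_LINES[0], LOG_LINES[1])
on_battery_to_shutdown = seconds_between(LOG_LINES[0], LOG_LINES[2])

print(f"on battery -> battery low: {on_battery_to_low} s")
print(f"on battery -> forced shutdown: {on_battery_to_shutdown} s")
```

So the battery was reported low after 3 minutes 25 seconds, and the forced shutdown started 5 seconds after that.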

So anyway, I woke up this morning and realised I had No Internet and after entering my underground data centre I was greeted with that eerie silence computers-that-are-supposed-to-be-on-but-aren’t make. Muttering at my crappy UPS I stuck my arm across the first machine to turn it on and received a bit of a zap.

It was like this, only not as bad.

Nothing too bad, kind of like licking a 9V battery, but on my bare arm (which is neither a tongue nor damp). And this is where things got weird. I got my multimeter, set it to AC voltage and put one of the probes onto the metal casing while the other probe wasn’t touching anything. The meter read 8VAC… Bit weird, so I took hold of the free probe and tapped it with my finger… the meter then read 115VAC.

Yes America, the voltage between my PC case and my finger read the same as your wall voltage. And here I was, stood barefoot on a slightly damp stone floor going “hmm this is indeed strange, seems there’s 115V going down my finger”. OK so the actual current must have been tiny since I wasn’t lying on the stone floor gently convulsing, but still, clearly we had a bit of a ground fault.

And yes, the sockets in this house are protected by an RCD. It didn’t trip, so I guess the current must have been really small and it was only the relatively high voltage that made me notice it. I have an 88 amp-hour 12V DC battery and can hold its terminals all day without feeling anything. Volts thrill, amps kill, but you need both together to grill.
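To put a rough number on "really small", here is a back-of-envelope Ohm's law sketch in Python. The 30 mA trip threshold is an assumption (a common rating for socket circuits; the actual rating of the RCD in this house isn't stated above), and the function names are hypothetical. It only gives an upper bound on the leakage current, not a measurement.

```python
# Assumption: a 30 mA trip threshold, a common rating for socket-circuit RCDs.
# The real rating of the RCD in this house is not given in the post.
RCD_TRIP_MA = 30.0


def current_ma(volts, ohms):
    """Ohm's law, I = V / R, with the result in milliamps."""
    return volts * 1000.0 / ohms


def min_resistance_ohms(volts, trip_ma=RCD_TRIP_MA):
    """Smallest fault-path resistance that keeps the current at or under trip_ma."""
    return volts * 1000.0 / trip_ma


measured_volts = 115.0  # the multimeter reading between the case and a finger
threshold_ohms = min_resistance_ohms(measured_volts)

print(f"At {measured_volts:.0f} V, staying under {RCD_TRIP_MA:.0f} mA "
      f"needs at least {threshold_ohms:.0f} ohms in the fault path")
```

In other words, under that assumed rating, anything much above roughly 3.8 kΩ between the casing and earth would keep the leakage below the trip point at 115V, which fits with an RCD that stayed quiet.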

After some probing with the continuity tester I realised there was no continuity between the wall socket’s earth pin and that PC I had been touching. Nor was there any continuity between the casing of the PC under it and this PC. A bit more probing with a multimeter showed me there was no continuity between the earth pins on each end of that PC’s power cable.

So yeah, I’d been drawing about 60W from the 240V mains quite happily without an earth connection, gradually letting the casing charge up. No wonder the UPS had been going strange – the ground pin of the USB connector was probably sending this unwanted current into the UPS’s monitor and control electronics.

I’ve removed the offending power cable and cut it in half. The half with the IEC connector has proper earth continuity, the half with the mains plug doesn’t. Later I’ll cut the plug open and see if there’s anything interesting in it.

This would also explain a few strange things I’ve noticed in the past. The UPS would only seem to go crazy if its USB management cable was attached. My old server would freak out whenever I plugged the UPS in – telling me there was ESD on the USB port. At one time I had a USB SATA enclosure that uses an IEC mains cable for its PSU. This SATA enclosure always seemed a bit odd, in that its metal casing always felt “strange” (if you’re one of those people who can touch an aluminium Macbook and feel a weird ‘bumpiness’, you’ll understand). I guess the common thing between them all was this mains cable.

Why did I keep using a clearly faulty mains cable? Well it was a convenient short length – it was only 75cm long, ideal for keeping wiring tidy. And it was a cable with moulded plugs that contain all the required UK safety logos and numbers.

Focus Stealing is Bad

From Wikipedia:

Focus stealing is when a program not in focus (e.g. minimised or in the background) places a window in the foreground and redirects all keyboard input to that window. This is considered a major annoyance by most users because the program may steal the focus while their attention is not on the computer screen, such as when typing while reading copy to the side. This will cause everything typed after the window appeared to be lost.

(From their entry on Focus Stealing)

Not only might it cause you to lose work, accidentally delete data or send things to the printer, but it really disrupts your workflow. A few minutes ago I was working on a PowerPoint presentation, and had Outlook open in the background. Without warning it just forced its way to the front to ask me the terribly important question of whether I want to AutoArchive my emails.

Would have been much better if it’d flashed the task bar at me instead. The only exceptions should be critical events such as a battery about to run out, a hard disk in danger of catastrophic failure, or anything else where there is an immediate danger of data loss or hardware damage unless the user is interrupted.

It’s not just Windows that does this. Yesterday I managed to disable the UPS on my Linux server, but kept the UPS software running. Being a critical event, the UPS software started sending alerts to every logged-in console in the hope I would see it. I did, because it smeared all over my IRC client’s display.

This is a valid time when focus stealing is appropriate. Unfortunately it then became highly irritating since the error message kept appearing even when I was attempting to fix the problem. It’s not easy reading documentation or editing config files when

Broadcast message from (root):

Device ‘BelkinUPS’ is not responding, blah blah blah fix it now blah blah

is being scrawled all over your screen every 30 seconds.