A shocking surprise after my UPS failed

So in my cellar I have a Linux server… and a NAS running FreeNAS… and a managed gigabit switch… and a router running pfSense… and maybe some PoE gear feeding power to an ADSL modem and a wifi access point elsewhere in the house. Yeah, I like complicated, complicated is cool.

Losing data because the power goes out isn’t cool. But having a UPS and the NUT UPS tools on my network is really cool – power goes out, the UPS starts screaming at me and all the hardware gets told and shuts down in a tidy manner. Very nice.

What isn’t nice is when the UPS appears to be faulty – randomly deciding the mains has gone off, even when it hasn’t, which is what it did at 1:55am this morning:

Aug 4 01:51:57 cex upsmon[1996]: UPS ups@chi on battery
Aug 4 01:55:22 cex upsmon[1996]: UPS ups@chi battery is low
Aug 4 01:55:27 cex upsmon[1996]: UPS ups@chi: forced shutdown in progress

Nice to know the system works, but would have been nicer if it only shut down when the actual mains had actually gone off, for real (like it did the other day at about 3pm). Seems I also need a better UPS, 4 minutes of power is a bit … short.
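As a rough check on that "4 minutes", the timestamps in the log above can simply be subtracted. A minimal Python sketch follows; the helper names are made up for illustration, and the year is a placeholder because syslog doesn't record one.

```python
from datetime import datetime

# The three upsmon lines quoted above, copied verbatim.
LOG_LINES = [
    "Aug 4 01:51:57 cex upsmon[1996]: UPS ups@chi on battery",
    "Aug 4 01:55:22 cex upsmon[1996]: UPS ups@chi battery is low",
    "Aug 4 01:55:27 cex upsmon[1996]: UPS ups@chi: forced shutdown in progress",
]

PLACEHOLDER_YEAR = 2000  # assumption: syslog omits the year, any year works here


def parse_time(line):
    """Return the timestamp at the start of a syslog-style line."""
    month, day, clock = line.split()[:3]
    return datetime.strptime(
        f"{PLACEHOLDER_YEAR} {month} {day} {clock}", "%Y %b %d %H:%M:%S"
    )


def seconds_between(lines, start_marker, end_marker):
    """Seconds from the first line containing start_marker to the first containing end_marker."""
    start = next(parse_time(l) for l in lines if start_marker in l)
    end = next(parse_time(l) for l in lines if end_marker in l)
    return int((end - start).total_seconds())


runtime = seconds_between(LOG_LINES, "on battery", "battery is low")
to_shutdown = seconds_between(LOG_LINES, "on battery", "forced shutdown")
print(f"Low battery after {runtime // 60} min {runtime % 60} s on battery")
print(f"Forced shutdown after {to_shutdown} s on battery")
```

So the battery lasted closer to three and a half minutes than four before the low-battery warning.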

So anyway, I woke up this morning and realised I had No Internet and after entering my underground data centre I was greeted with that eerie silence computers-that-are-supposed-to-be-on-but-aren’t make. Muttering at my crappy UPS I stuck my arm across the first machine to turn it on and received a bit of a zap.

[Image: “jimmy”. Caption: It was like this, only not as bad.]

Nothing too bad, kind of like licking a 9V battery, but on my bare arm (which is neither a tongue nor damp). And this is where things got weird. I got my multimeter, stuck it on AC voltage and put one of the probes onto the metal casing, the other probe wasn’t touching anything. The meter read 8VAC… Bit weird, so I took hold of the free probe and tapped it with my finger… the meter then read 115VAC.

Yes America, the voltage between my PC case and my finger read the same as your wall voltage. And here I was, stood barefoot on a slightly damp stone floor going “hmm this is indeed strange, seems there’s 115V going down my finger”. OK so the actual current must have been tiny since I wasn’t lying on the stone floor gently convulsing, but still, clearly we had a bit of a ground fault.

And yes, the sockets in this house are protected by an RCD. It didn’t trip, so I guess the current must have been really small and it was only the relatively high voltage that made me notice it. I have an 88 amp-hour 12V DC battery and can hold its terminals all day without feeling anything. Volts thrill, amps kill, but you need both together to grill.
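To put a rough number on "really small": one common explanation for an unearthed metal case floating at about half the mains voltage is leakage through the small filter capacitors inside a PC power supply. Nothing in this post confirms that was the cause here, so treat the following as a sketch under stated assumptions. The 2.2 nF capacitor value and the 30 mA RCD trip rating are illustrative guesses, not measurements.

```python
import math


def capacitor_leakage_ma(mains_volts, mains_hz, capacitance_farads):
    """Current in mA through a capacitor connected across the given AC voltage."""
    reactance_ohms = 1.0 / (2 * math.pi * mains_hz * capacitance_farads)
    return mains_volts / reactance_ohms * 1000.0


# Assumptions, not measurements:
ASSUMED_MAINS_VOLTS = 240.0     # the post mentions 240V mains
ASSUMED_MAINS_HZ = 50.0         # UK mains frequency
ASSUMED_CAPACITANCE_F = 2.2e-9  # hypothetical filter capacitor, live to case
ASSUMED_RCD_TRIP_MA = 30.0      # hypothetical RCD trip rating

leak_ma = capacitor_leakage_ma(
    ASSUMED_MAINS_VOLTS, ASSUMED_MAINS_HZ, ASSUMED_CAPACITANCE_F
)
print(f"Leakage of roughly {leak_ma:.2f} mA against an RCD trip level "
      f"of {ASSUMED_RCD_TRIP_MA:.0f} mA")
```

With those numbers the leakage is a fraction of a milliamp: enough to tingle, nowhere near enough to trip an RCD.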

After some probing with the continuity tester I realised there was no continuity between the wall socket’s earth pin and that PC I had been touching. Nor was there any continuity between the casing of the PC under it and this PC. A bit more probing with a multimeter showed me there was no continuity between the earth pins on each end of that PC’s power cable.

So yeah, I’d been drawing about 60W at 240V quite happily without an earth connection, gradually letting the casing charge up. No wonder the UPS had been going strange – the ground pin of the USB connector was probably sending this unwanted current into the UPS’s monitor and control electronics.

I’ve removed the offending power cable and cut it in half. The bit with the IEC connector has proper continuity, the bit with the mains plug on doesn’t. Later I’ll cut the plug open and see if there’s anything interesting in it.

This would also explain a few strange things I’ve noticed in the past. The UPS would only seem to go crazy if its USB management cable was attached. My old server would freak out whenever I plugged the UPS in – telling me there was ESD on the USB port. At one time I had a USB SATA enclosure that uses an IEC mains cable for its PSU. This SATA enclosure always seemed a bit odd, in that its metal casing always felt “strange” (if you’re one of those people who can touch an aluminium MacBook and feel a weird ‘bumpiness’, you’ll understand). I guess the common thing between them all was this mains cable.

Why did I keep using a clearly faulty mains cable? Well it was a convenient short length – it was only 75cm long, ideal for keeping wiring tidy. And it was a cable with moulded plugs that contain all the required UK safety logos and numbers.

Broken USB – very bad!

Oh dear, the USB subsystem on my server has died…

[3704510.037506] uhci_hcd 0000:01:0b.1: host system error, PCI problems?
[3704510.037564] uhci_hcd 0000:01:0b.1: host system error, PCI problems?
[3704510.037611] uhci_hcd 0000:01:0b.1: host controller process error, something bad happened!
[3704510.037660] uhci_hcd 0000:01:0b.1: host controller halted, very bad!
[3704510.037716] uhci_hcd 0000:01:0b.1: HCRESET not completed yet!
[3704510.037723] uhci_hcd 0000:01:0b.1: HC died; cleaning up

I think I’ll just be rebooting my computer now then. Yes, reboot time.

And I’d just got CUPS working properly too, and was trying the ambitious thing of printing from an RDP connection via my desktop PC to the printer plugged into my server. The print job made it out of the remote machine and into the server’s print queue so I think it’ll work.

Remote nerd alert

And now for something utterly pointless and wholly nerdy…

I like to use IRC to chat to people. I dislike Windows IRC clients, and much prefer irssi in Linux, which is great because I have a Linux server that I can run it on. Even better is the combination of SSH and GNU Screen, allowing me to SSH into my machine from anywhere while preserving the state of my login.

I also have access to a Windows machine via RDP which is in someone’s office, about 50 miles from here.

For no real reason I thought it’d be amusing to run PuTTY on the remote Windows machine (via the Remote Desktop application on my XP machine), and use it to SSH into my server to chat on IRC.

So, I am using remote desktop which sends compressed 1440×960 bitmaps across the Internet so that I can run PuTTY to see ASCII text via SSH to the computer sat next to me. So efficient, encrypted, compressed SSH data leaves my server, goes through the Internet to the Windows machine. There it gets drawn on the screen inside PuTTY, with the whole screen being captured, compressed and sent back over the Internet to my Windows PC.

Oh, did I mention the remote Windows PC isn’t actually a real PC? It’s a VMware virtual machine instance running under Windows Server 2003 on a 2.67GHz Core i7.

Earlier I had it printing from the remote machine to the printer plugged into my own PC, which was actually useful, unlike the mess I just described above which was completely pointless but fun. No, I have no life, I’m about to sit and write some SQL.

Ubuntu 9.10 – Tedious Timewaste

I’m attempting to install Ubuntu 9.10 server edition on my server. To say it’s not going smoothly would be as big an understatement as saying “That Hitler bloke, he was a bit naughty, wasn’t he?”. The damn thing just won’t boot up! It gets as far as saying ‘GRUB Loading.’ and then gets no further.

At first I thought it might be the weird combination of IDE controllers and disks I have. I have a 1TB SATA drive, plus two PATA drives. The machine is supposed to boot from one of the PATA drives, and use the SATA as a data drive. This used to work. It even used to work with some crazy extra IDE card in the machine. The motherboard has some half-baked combination of IDE, IDE-RAID and SATA, giving me a total of ten possible disks in the machine. Whoever designed this motherboard was going for a bit of everything, the machine even takes DDR and DDR2 RAM.

Thinking that maybe all this crap was confusing things I switched it all off and pulled out every drive except the drive I wanted to boot from and reinstalled Ubuntu on that. GRUB was installed, it all went well… then the machine rebooted and sat there looking like an oversized doorstop.

I know the BIOS can find the correct disk because I see the ‘GRUB Loading.’ message, but then it seems GRUB fails to find the rest of itself and stops working.

My next plan is to install onto a spare SATA disk I have to see what happens. If that fails I’ll install a previous version of Ubuntu to see if they broke something in this version. It seems they’ve switched to something called GRUB 2, which has lots of new cool features. Is “booting my system” one of these new features?

75C CPU Temp

My server just locked up. After connecting a screen I saw a kernel panic, so figuring something must have gone wonky I rebooted the machine. While doing this I noticed the computer was rather warm. Warm to the point where, crawling around on the floor next to it, I could feel the heat radiating off the casing on my face. After booting back up I noticed the CPU temp was at 75C and the system temperature was around 45C. I think the system temp is the ambient temperature of the motherboard, so the inside of the computer was probably close to that.

It now has three more fans in it, whirring away, and I can feel a definite airflow through the vents. This is probably a good thing since I burnt myself on the southbridge heatsink.

I’m tempted to move the machine downstairs into my front room, where it is always cold. The problem is where to place it, and the fans are a bit noisy which would be irritating while watching TV. I have space under my stairs where it’s cool, but it also gets damp there.

Server all sorted now

After spending all night and day shuffling data off old hard drives onto my new terabyte drive, everything is complete. Some of the drives in my server were really slow and it’s only because it was attached to my network that I never noticed. My main video drive, for example, was managing a whole 2 megabytes per second. It took ages to empty that!
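To see why 2 megabytes per second counts as "ages", here is a back-of-envelope Python sketch. The 2 MB/s rate is the figure quoted above; the 131G size is an assumption borrowed from the used space on /data/pub/video in the df listing further down, on the guess that it's the same drive. The function name is made up for illustration.

```python
def transfer_hours(size_gib, rate_mib_per_s):
    """Hours needed to move size_gib gibibytes at a sustained rate in MiB/s."""
    seconds = size_gib * 1024 / rate_mib_per_s
    return seconds / 3600


ASSUMED_VIDEO_GIB = 131  # assumption: taken from the df listing, not measured here
QUOTED_RATE_MIB_S = 2    # the "whole 2 megabytes per second" quoted above

hours = transfer_hours(ASSUMED_VIDEO_GIB, QUOTED_RATE_MIB_S)
print(f"About {hours:.1f} hours to empty the drive")
```

Under those assumptions that one drive accounts for most of a day on its own.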

After removing the five old drives and the ATA controller card the machine draws 100W of power. I have left the Kill A Watt plugged in permanently and will watch it out of mild interest. Strangely the UPS draws 50W with no load.
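For a sense of what those wattages add up to, a small Python sketch converts a constant draw into kilowatt-hours per year. It assumes the load really is constant and the machine runs all year, which is a simplification, and it leaves out the price per kWh since that isn't given here.

```python
HOURS_PER_YEAR = 24 * 365


def annual_kwh(watts):
    """Energy used in a year by a load drawing a constant number of watts."""
    return watts * HOURS_PER_YEAR / 1000


server_kwh = annual_kwh(100)  # the server figure quoted above
ups_kwh = annual_kwh(50)      # the UPS's no-load figure quoted above
print(f"Server: {server_kwh:.0f} kWh/year, UPS overhead: {ups_kwh:.0f} kWh/year")
```

On that basis the idle UPS alone uses half as much energy as the server it protects.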

With all my data moved across I have 583GB of space free. This should do me for a few years if I remember to clean up and periodically delete accumulated junk.

For those of you who are interested, have some stats:

Disk /dev/sda: 1000.2 GB, 1000204886016 bytes
255 heads, 63 sectors/track, 121601 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1               1      121601   976760001   83  Linux

Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: ATA      Model: MAXTOR STM310003 Rev: MX15
  Type:   Direct-Access                    ANSI SCSI revision: 05

/dev/sda1:
 Timing cached reads:   578 MB in  2.00 seconds = 288.34 MB/sec
 Timing buffered disk reads:  226 MB in  3.01 seconds =  74.98 MB/sec

One Terrorbyte of space! (or around 870GB if you can count properly)

Filesystem            Size  Used Avail Use% Mounted on
/dev/hda3              14G  2.0G   12G  15% /
varrun                221M  300K  220M   1% /var/run
varlock               221M     0  221M   0% /var/lock
procbususb            221M  120K  221M   1% /proc/bus/usb
udev                  221M  120K  221M   1% /dev
devshm                221M     0  221M   0% /dev/shm
/dev/hda1              99M   32M   63M  34% /boot
/dev/hdb1              58G   39G   19G  69% /data
/dev/hdc1              38G   12G   26G  32% /data/pub/pictures
/dev/hdd1              74G   54G   16G  78% /data/pub/audio
/dev/sda1             113G   57G   51G  53% /data/backups
/dev/sdb1             147G  131G  8.3G  95% /data/pub/video
/dev/sdc1             917G   17G  854G   2% /mnt
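The gap between "1TB" on the box and the 917G that df shows for /dev/sdc1 is mostly a matter of units. A quick Python sketch, using the byte count that fdisk reports for this drive in the newer post further up the page (the function names are just for illustration):

```python
DISK_BYTES = 1000204886016  # byte count from the fdisk output for the 1TB drive


def to_decimal_gb(num_bytes):
    """Gigabytes the way drive makers count them (powers of ten)."""
    return num_bytes / 10**9


def to_gib(num_bytes):
    """Gibibytes the way df -h counts them (powers of two)."""
    return num_bytes / 2**30


print(f"{to_decimal_gb(DISK_BYTES):.1f} GB decimal")
print(f"{to_gib(DISK_BYTES):.1f} GiB binary")
```

That gives roughly 931 GiB for the raw disk. The smaller size and available figures in df are presumably what's left after filesystem overhead and reserved blocks.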

See the tiddly hard disks that are mostly full in that list? They’re all going to be removed and replaced with that nice, shiny 1TB drive. Rather than having six drives in my computer chewing away at my electricity bill, there will be two – a PATA boot drive and the SATA data drive.

Copying the data across takes quite a long time though.

A collection of Amigas and a WYSE Terminal

I’ve just acquired an interesting collection of old computer kit. I now have:

  • An Amiga 500 without PSU
  • An Amiga 1500 with monitor and keyboard
  • A WYSE serial terminal
  • And a USB floppy drive

The A500 will probably end up sitting on a shelf somewhere until I can find a PSU for it. I’ll use the A1500 since it has a hard disk and a monitor; I don’t exactly trust 21-year-old floppy disks or have a spare TV to use. The Amiga 1500 has a slightly damaged keyboard, but it’s only a few keys on the number pad that I’ll probably never press anyway.

The terminal might have a problem that causes it to turn off randomly, but I tested it the other night for about an hour and it seemed OK. Once I’ve confirmed I can connect it to my Linux machine I’ll create a giant serial cable and put it downstairs somewhere. It’ll make a nice IRC client or quick login to my server to check things. I figure I can make a long serial cable from some spare Cat5 cable and DB9 connectors.