
Snippet: Server Downtime

*snippet* As some of you might have noticed, my blog was down for around 36 hours starting on Thursday night (at 23:43 GMT). Basically, the server that hosts the *.beebware.(co.uk|com) family of websites, and a few others, suffered a partial hard disk failure which resulted in data corruption. Luckily, it didn’t affect the “user data” part: only around a dozen files within the Linux /var/ settings (IIRC). Unfortunately, these files are needed by Apache, SSH and several other “system critical” services, which meant the server practically failed 🙁

I alerted our technical guys to it (when any of the “system critical” things fail on my server, I get lots of email and SMS text alerts which wake me up) and they managed to shut down the server before any further data corruption occurred. Everything on the server is backed up by a rolling 24-hour backup process, but it’s still a case of swopping out the broken drive, putting in a new drive and copying the data across.

Due to various problems, I was given an initial estimate of “sometime on Monday”. Aarghhh! Not good at all 🙁 So what to do? Well, I had a quick word with someone I know at 3050 Hosting who managed to set me up with an emergency 30 hosting package (which is a cheap web hosting package at just £30 per year) and I just needed to point all the DNS entries at the new server after I put up a temporary “holding page”.

Why didn’t I use one of the other Beebware servers? Well, a) I didn’t want the holding page to be in the same “netblock” as the main servers (as it could have jeopardised some of the search engine placement results) and b) they didn’t have enough spare capacity for the load that the sites hosted by this server require.

No sooner had the DNS changes started to propagate across the internet than my monitoring systems indicated the server had come back online with less than 24 hours of downtime (it came back up at 20:40 GMT). I had set a “DNS expiry time” of 24 hours – so the DNS wouldn’t start timing out until at least 3pm on Saturday…
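For anyone wondering where “3pm on Saturday” comes from, here is a rough sketch of the arithmetic in Python. It assumes the DNS change went out at about 15:00 on the Friday (inferred by counting 24 hours back from 3pm Saturday) and uses a made-up calendar date that happens to be a Friday; the function name is purely illustrative.

```python
from datetime import datetime, timedelta


def earliest_cache_expiry(changed_at: datetime, ttl: timedelta) -> datetime:
    """Earliest time a resolver that cached the new record at the moment
    of the change is allowed to drop it and look the name up again."""
    return changed_at + ttl


# Hypothetical date (any Friday would do). The 15:00 time is an assumption,
# inferred from "3pm on Saturday" minus the 24-hour expiry time.
dns_changed_at = datetime(2007, 6, 1, 15, 0)
ttl = timedelta(hours=24)

expiry = earliest_cache_expiry(dns_changed_at, ttl)
print(expiry.strftime("%A %H:%M"))
```

Resolvers that picked up the temporary record later than that will hold on to it correspondingly longer, so this is only the earliest point at which things can start to settle.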

Anyway – that’s what happened. I’m still waiting for “total confirmation” to come back from the technical team that they are happy that the server is running correctly, but everything seems A.O.K. to me at the moment: it’s just a case of waiting for DNS to re-stabilise (I didn’t think it’d be fair to do a “Guess That Movie” competition when only a few people could probably access it).

I’ll try and get “normal service” resumed shortly…

3 Comments

  1. Many thanks – did you know that your blog appears down at the moment? Saying that, I can’t reach BigPinkCookie either and they’re both hosted on the same netblock so it could just be a routing problem my end…

  2. Ah, good to see your problems are fixed.
