Author: Richy B.

Follow me on Mastodon at @rbairwell@mastodon.org.uk or just the posts on this blog by following @richyb@blog.rac.me.uk .

Personal: Life Update

Well, it seems I’ve managed to sort out that data parser I was working on – it finished its first successful run at 2.34am this morning. However, I was just falling asleep before it completed, as I’ve been too tired to do much (amount of code added to the parser since last week: 6 lines of regexps; number of runs attempting to parse the data: I’ve lost count!).

So now I’ve got 1GB of SQL statements to load into a database whilst combining at least another 85MB of unstructured data with it. Then I’ve got to parse out HTML/PHP pages which have dynamic remote XML/SOAP parsing systems with cacheability, whilst ensuring the whole database can be searched – and then I’ve got to work on stage two, which is “data expansion”, as by then I’ll have enough data to seed the new system, which will use the dynamic SOAP system to capture even more data whilst another system spiders a few items. It’s hard work (even the theoretical planning and timeframes are a beast), but hopefully I’ll have a helluva good system by the month end if I can actually get time to work on it.

Why don’t I devote more time to it? Well, let’s just say this – I’ve been at work 3 days so far this week and yet I’ve spent so long in the office I’ve already done 4 days’ work if you count the hours (Monday was systems upgrades and hardware modifications to the sales machines, Tuesday was finishing off the coding for a dynamic employment site along with finalising some bits with two new billing/CRM systems I’ve written, today was “manual labour” – moving lots of stuff halfway across the building and up a floor and then totally rearranging and rewiring the office). And tonight I need an early night as I’m going “preliminary job hunting” tomorrow.

Snippet: Microsoft Word Maximum File Size

*snippet* I’ve just tried loading a 2GB file into Microsoft Office 2000 Word (thinking that as Word supports “Virtual Memory” et al, it’ll be able to cope with the fact I’ve only got half a gig of RAM). It chugs along for a number of minutes loading the file and then pops up:

Word cannot open this file because it is larger than 32 Megabytes in size.

I never even knew Word had a maximum file size! But what really peeved me was the fact that the Microsoft programmers didn’t implement a very small routine to check the file size before even attempting to open a file – it’s not like it’s a difficult thing to do…
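For what it’s worth, a “check first, open second” routine really can be tiny. The following is only a minimal Python sketch of the idea, not how Word does (or should do) it: the function name `open_if_small_enough` is made up for illustration, and the 32-megabyte figure is simply taken from the error message above.

```python
import os

# Limit taken from the error message quoted above (32 megabytes).
MAX_BYTES = 32 * 1024 * 1024


def open_if_small_enough(path, max_bytes=MAX_BYTES):
    """Open `path` for reading, but refuse up front if it exceeds `max_bytes`.

    The size comes from the filesystem metadata, so no file content is read
    before the decision is made.
    """
    size = os.path.getsize(path)
    if size > max_bytes:
        raise ValueError(
            f"{path} is {size} bytes, larger than the {max_bytes}-byte limit"
        )
    return open(path, "rb")
```

Because the size is read from the file’s metadata, the refusal is effectively instant whether the file is 33 megabytes or 2 gigabytes.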

Snippet: Groan… Lots of Data :(

*snippet* I’m currently in the process of rebuilding large sections of my website(s) and need to import a substantial amount of data into the new content management system. By “substantial amount” we’re talking around 2GB of data(!). However, the data seems very very slightly corrupted (around a quarter of a record every 1 million entries) so I have to run another script to correct the corruption and then rerun the parser utility. And I’ll tell you this, even on a 2.4GHz machine, parsing 2GB of data takes a looong time. Especially when it fails and you’ve got to restart from scratch.

Of course, once it’s parsed, I’ve then got to import it all into a MySQL database (I’m having it write the SQL statements instead of directly importing it for speed reasons), and then index it (which will take ages: believe me, once you start hitting the half-a-million row point onwards on MySQL it begins to crawl) and then link all the data together and then export it into a suitable format: no way am I going to bog down my server by having it make around 50 database requests per page!
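As a rough illustration of the “write the SQL statements out rather than importing directly” approach, here is a minimal Python sketch. It is an assumption about how such a step could look, not the actual parser: the table name `records`, the columns `id` and `title`, the batch size and the helper names are all hypothetical, and the quoting is deliberately simplified (a real import should rely on the database driver’s own escaping).

```python
def sql_literal(value):
    """Simplified quoting for illustration only: NULL, a number, or a quoted string."""
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    # Escape backslashes first, then single quotes.
    text = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return f"'{text}'"


def _flush(batch, out, table, columns):
    """Write one multi-row INSERT statement containing every row in `batch`."""
    out.write(f"INSERT INTO {table} ({', '.join(columns)}) VALUES\n")
    out.write(",\n".join(batch))
    out.write(";\n")
    return 1


def write_inserts(rows, out, table="records", columns=("id", "title"), batch_size=500):
    """Write `rows` to the file-like `out` as batched INSERTs; return the statement count."""
    batch = []
    statements = 0
    for row in rows:
        batch.append("(" + ", ".join(sql_literal(v) for v in row) + ")")
        if len(batch) == batch_size:
            statements += _flush(batch, out, table, columns)
            batch = []
    if batch:  # write whatever is left over in the final, partial batch
        statements += _flush(batch, out, table, columns)
    return statements
```

Grouping many rows into each INSERT keeps the statement count down, and the resulting file can be loaded afterwards in a single pass.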

Fingers crossed that I’ll have it all parsed by Sunday…

(That’s why there haven’t been that many blog entries: my machine begins to crawl whilst parsing – and it’s just a command line parsing system anyway: no GUI to slow it down).

Guess That Movie: LXX

Well, while I continue to clear my holiday email backlog (down to 350 emails now – most are either spam, automatic notifications which just require a courtesy glance, or emails I’ve now dealt with), I’ll let you concentrate on the Guess That Movie competition round 70!

We had quite a large number of entries for Guess That Movie 69 and I’d just like to welcome new contestants “Jennifer Cunningham”, “Lanre”, “Fluffy”, and “Jurgen” who all guessed correctly at the film The Hollow Man – but were also beaten to first place by another new contestant – “Kevin” (Kevin Bacon perchance? 😉 ) – let’s see how many return to try and get the prizes.

For those of you who are new, read on for the details of the competition and the prizes you could win – otherwise, just make your guess by leaving a comment with your name (or alias), email address (real one please: only I can see them anyway and I hate spam), web site address (totally optional) and the name of the movie you are guessing at.