Warning: extremely long post (2,838 words)!
Well, a lot has been happening in the “World of the Open Directory Project” (a.k.a. DMoz) in the last couple of weeks.
First of all, because of server load issues, the internal editor forums have been moved to a new server (yippee!) that authenticates with the main server to ensure only valid users can log in. Good idea, but it’s had a few teething troubles (boo!).
Secondly, to help reduce the load on the main part of the ODP, editors have been given a “special” port number on which to connect to edit (hopefully reducing some of the overloading issues on the Apache webserver) – all good. Except if you are behind a corporate firewall and they block that port number 🙁
Thirdly, the “mirror server” at http://ch.dmoz.org/ (which is hosted by a fellow editor in Zurich, Switzerland) seems to have “taken off” a bit and is being used by a larger number of people now (mainly as it’s a lot faster) – it’s transferring around 19Gb of data a month.
Fourthly, the ODP staff members have managed to produce “a” copy of the RDF dump. The RDF dump is, in fact, a big big big file which contains the URLs, titles and descriptions of all the (nearly) 4 million sites listed in the ODP. Due to a large number of technical issues, this dump hasn’t been correctly produced since September last year. The RDF dump is usually downloaded by organisations such as ‘Google’ to produce localised copies of the ODP (for instance the “PR enhanced” listings at the Google Directory). ODP staff have (this week in fact) managed to produce an RDF dump which is available via rdf.dmoz.org, but there’s a slight problem: it doesn’t contain “catid”s (unique category identifier numbers). These numbers got “clobbered” during the technical problems, so ODP staff are having to correct the database manually. Hopefully they’ll be fixed soon, but at least the ODP search has now been updated (since that uses the RDF dump) and there is an RDF dump for others to download and play with (which I’m intending to do this weekend).
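For anyone else wanting to play with the dump, here is a rough sketch of how one might stream through a file like this in Python and pull out the URL, title and description of each listing. The element and attribute names (ExternalPage, about, Title, Description), the namespace and the sample data are all assumptions made up for illustration, so check them against the real file before relying on this.

```python
import io
import xml.etree.ElementTree as ET

# Tiny made-up sample standing in for the real dump (hypothetical structure).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<RDF xmlns:d="http://example.org/hypothetical-namespace/">
  <ExternalPage about="http://www.example.com/">
    <d:Title>Example Site</d:Title>
    <d:Description>A made-up listing used only for this sketch.</d:Description>
    <topic>Top/Example</topic>
  </ExternalPage>
</RDF>
"""


def local_name(tag):
    """Strip any '{namespace}' prefix that ElementTree puts on a tag."""
    return tag.rsplit("}", 1)[-1]


def iter_sites(source):
    """Yield (url, title, description) for each assumed ExternalPage element.

    iterparse reads the file incrementally, which matters for a very
    large dump that would be awkward to load in one go.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        if local_name(elem.tag) != "ExternalPage":
            continue
        # Assumption: the site URL is stored in an 'about' attribute.
        url = next(
            (value for key, value in elem.attrib.items()
             if local_name(key) == "about"),
            None,
        )
        # Collect the text of each direct child, keyed by its local tag name.
        fields = {
            local_name(child.tag): (child.text or "").strip()
            for child in elem
        }
        yield url, fields.get("Title", ""), fields.get("Description", "")
        elem.clear()  # drop this element's children once we are done with it


for url, title, description in iter_sites(io.BytesIO(SAMPLE)):
    print(url, "|", title, "|", description)
```

To try it on a downloaded file, pass an open binary file object (or a path) to iter_sites instead of the in-memory sample.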