This quick awk one-liner (tested on Linux) will extract all referer (referrer) entries from a combined-format Apache log file (called LOGFILE), excluding blank entries such as “”, “-” and (empty), and any containing DOMAIN. Useful if you want to check where external referers are coming from.
awk '($11 !~ /^"?-?"?$/ && $11 !~ /DOMAIN/) {print $11}' LOGFILE
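If you would rather do the same filtering inside a script, here is a rough Python sketch of the same idea. The function name and the own_domain parameter are mine rather than anything standard, and unlike awk it treats the domain as a plain substring, not a regular expression.

```python
import re

# Mirrors the awk test: a referer that is empty, -, "" or "-" counts as blank.
BLANK_REFERER = re.compile(r'^"?-?"?$')


def external_referers(lines, own_domain):
    """Yield field 11 (the quoted referer) of each combined-format log line,
    skipping blank entries and any referer containing own_domain."""
    for line in lines:
        fields = line.split()      # whitespace splitting, as awk does by default
        if len(fields) < 11:
            continue               # short or malformed line: nothing to report
        referer = fields[10]       # awk's $11 (Python counts from zero)
        if BLANK_REFERER.match(referer) or own_domain in referer:
            continue
        yield referer
```

To run it over a real log, something like `for r in external_referers(open("LOGFILE"), "DOMAIN"): print(r)` should do.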
Richy's Random Ramblings
Just a few bullet points about some features which would encourage me to buy a new Sat Nav:
- Able to set route via computer
  - I can tweak a route via Google Maps to avoid areas I don’t like for whatever reason, or to “swing by” places – I’d love to be able to just then click “Send to Satnav” and for it to be picked up
- Traffic planning
  - Will it be quicker for me to go via the A50 or the M1 tomorrow morning and what time should I leave to be at my destination for 8.50am? Surely Satnavs can get traffic reports, trend them over time, and then suggest the best route for the time of day (and flag if the traffic flow is already appearing different today as opposed to yesterday/last week)
- Alternative transportation options
  - Could it even be faster for me to not drive to my destination, but to drive to one of Leicester’s Park and Ride facilities and then catch the bus in from there? How about drive to Loughborough train station and then catch the train in? If I could set a “Petrol costs XX much, a day parking costs XX, trains from YY costs XX, park and rides cost XX: which would be the cheapest/quickest/most reliable routes?”
This post was originally published, commercially, in “Archive Magazine” Volume 13/Issue 9 (June 2000) pages 45-47.
This article was written for publication by Archive Magazine in response to a request by the editor for an article covering FTP.
What Is FTP?
Richard Chiswell
A few years ago (around the time of the BBC Micros), the easiest and simplest way to transfer files from one computer to another was via ‘sneaker net’ – the (American) term given to the method of copying files to a floppy disc (or even tape) and taking it to another machine. Nowadays this isn’t so simple, as many people (such as myself) keep files on many machines many miles apart. For example, my website is hosted on a machine in London while I’m based in Leicester. I’m certainly not going to sit on a train for a few hours just to be able to make a few small changes to my website.
Hence, in June 1971 (very, very old by internet standards – think around 100 years in real life times), FTP was designed and implemented. FTP is an abbreviation for ‘File Transfer Protocol’ and, as its name suggests, it is a method of transferring files – allowing you to easily and simply upload and download files from remote machines. For the complete technical specification on how it all works, see RFC 172 and RFC 959 (Request for Comments). The details contained in the RFCs are quite complex and you will only really need to read them if you are designing an FTP client or want to get in-depth information about it all.
What’s This To Do With RISC OS?
Well, many people (including myself) maintain websites using our RISC OS machines – after all, it is our ‘platform of choice’ and I find development easier under RISC OS than on other platforms (for example, Zap’s colour coding for Perl and HTML beats any PC software that I know of).
First of all, you need a working internet stack. An internet stack is the program that actually dials up your internet service provider (ISP) and connects you to the internet. The most common internet stacks on RISC OS are the ANT Internet Suite, Acornet or the Argonet Voyager suite. So if you haven’t installed any of those then go get a copy and install it now – you need to be able to connect to the internet for this article to be at all useful for you. The very old Doggysoft Termite internet stack didn’t conform to the ‘Acorn’ standard and very few third-party programs will work with it; as it isn’t being actively supported or developed any more, you may as well upgrade now.
Next, you’ll be needing what is called an ‘FTP client’. These allow you to upload and download files using FTP. Yes, I know Fresco, ArcWeb, Browse et al support FTP downloads, but you need an FTP client to be able to upload files. Personally, I find ANT’s !FTP and Colin Granville’s FTPc the best FTP clients for my use. Why two? Well, FTPc supports ‘recursive files’ (I can just drag a directory to it and it will upload all the contents of the directory and subdirectories), whereas !FTP allows me to see ‘hidden files’ on my Posix machines (i.e. .htaccess etc).
What Is Posix?
As this term crops up quite a few times in this article, I thought I might as well tell you what it means. Posix is the now accepted term for Unix, Linux, NetBSD, RiscBSD type computer systems – the term used to be ‘Unix-clone’ but AT&T, who own the trademark on the Unix name, have previously raised objections, so the terminology has changed to ‘Posix’ or ‘Posix-clone’.
Most of the internet services you access use Posix themselves. If you dial up to connect to the internet, your ISP’s servers probably use Posix to hold the log-on details. If you browse a website, it is most likely to be hosted on a Posix machine (either that or Microsoft Windows NT). If you send email, it will pass through Posix machines. Internet systems which don’t actually run Posix (such as Microsoft Windows and, of course, RISC OS) have to try and ‘pretend’ to be Posix because of the standard that has evolved. This means that filenames should be case-sensitive and of any length, that directories are separated by a / instead of the RISC OS ‘.’ and file extensions (such as .txt and .html) are optional and of any length. Most of the time, you won’t have to worry about the differences as the internet programs you use will make the conversion themselves, but it is a ‘good thing to know’.
Posix is also aware of things like ‘user groups’, ‘user permissions’, ‘hidden files’ (these normally start with a dot as in the ‘common’ files: .htpasswd, .htaccess and .sig) and symbolic links (where a file or directory can appear to be in two places at once, but only one copy actually exists).
About FTP Sites
You may notice that FTP sites follow the ‘Posix directory format’ – files are in the format directory/file.extension instead of the RISC OS format directory.file/extension – but again don’t worry as your FTP client should convert the files to RISC OS format automatically. Also, file extensions are completely optional on FTP sites, as they are on RISC OS, but you will probably find that most files have extensions for the benefit of PC users.
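To show what that conversion amounts to, here is a tiny Python sketch that swaps the two separator characters. The function name is made up for illustration, and a real FTP client will almost certainly do more than this (filetypes, illegal characters and so on).

```python
# Swap the RISC OS directory separator '.' with the Posix one '/',
# and the RISC OS extension separator '/' with the Posix '.'.
_SWAP_SEPARATORS = str.maketrans("./", "/.")


def riscos_to_posix(name: str) -> str:
    """Convert a RISC OS style name such as directory.file/extension
    into the Posix style directory/file.extension."""
    return name.translate(_SWAP_SEPARATORS)


print(riscos_to_posix("directory.file/extension"))  # directory/file.extension
```

Because the swap is symmetrical, running the same function on a Posix name gives you the RISC OS form back.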
FTP files can be transferred in two formats – ASCII and binary. Most (if not all) FTP clients nowadays have an ‘auto’ setting that will sort this out for you, so you will probably never have to worry about it. The main difference is that you cannot transfer programs and data as ASCII text as it may contain ‘top-bit’ characters which may not transfer correctly, so the ‘binary’ format transfers them slightly differently. Likewise, if you transfer ASCII files (such as a text file) in binary, you may end up with garbage. So try to keep ‘auto’ selected.
What Are Top-Bit Characters?
The internet, for historical reasons, was mostly designed around the 7-bit character set, whereas modern computers normally use 8-bit characters. What’s the difference? Well, in a 7-bit system, you have a smaller selection of characters – the highest ASCII code is 127, which gives you all the letters (in both cases) and all the numbers and a few other characters, but that’s it. In an 8-bit system, the ASCII code limit goes up to 255 and this allows all sorts of exotic characters such as the £ pound sterling symbol, super-script characters such as ¹, accented characters, the copyright © symbol and quite a few more.
The problem arises because programs and data (including archives, graphics and sound files) often contain bytes which use these top-bit values, so if you transfer them through a 7-bit system and back off again, the files are corrupted. When you transfer a file in ‘binary’ mode, FTP sends every byte exactly as it is stored, which avoids this corruption problem; in ASCII mode, it treats the file as plain text and may convert it (line endings, for example) along the way.
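To make the top-bit problem concrete, this little Python sketch pretends to be a 7-bit system by clearing the top bit of every byte. The function name is invented for the example, and it assumes the £ sign is stored in the Latin-1 character set, where its code is 163.

```python
def strip_top_bit(data: bytes) -> bytes:
    """Simulate a 7-bit channel by clearing bit 7 of every byte."""
    return bytes(b & 0x7F for b in data)


text = b"Hello, world"           # plain ASCII: every byte is below 128
pound = "£".encode("latin-1")    # one byte, 0xA3 (163): a top-bit character

print(strip_top_bit(text) == text)   # True: plain ASCII survives untouched
print(strip_top_bit(pound))          # b'#': 0xA3 has turned into 0x23
```

So plain text gets through unharmed, while a single lost bit turns a pound sign into a hash; do that to every top-bit byte in an archive or a graphic and the file is ruined.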
The main thing to remember is to use ‘auto’ mode if you are offered it and, failing that, to use ASCII to transfer plain text (such as HTML and Zap/StrongEd/Edit text files) and binary for anything else (such as graphics, Ovation Pro documents, MP3s etc).
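If your client has no ‘auto’ setting, the rule of thumb above is easy to write down. This Python sketch guesses the mode from the file extension; the list of extensions is just my assumption for illustration (a RISC OS client would more likely look at the filetype), and anything it doesn’t recognise is sent as binary to be on the safe side.

```python
import os

# Hypothetical list of extensions that we treat as plain text.
TEXT_EXTENSIONS = {".txt", ".html", ".htm", ".pl", ".css"}


def choose_mode(filename: str) -> str:
    """Return 'ascii' for known plain-text extensions, 'binary' otherwise."""
    extension = os.path.splitext(filename)[1].lower()
    return "ascii" if extension in TEXT_EXTENSIONS else "binary"


print(choose_mode("index.html"))  # ascii
print(choose_mode("logo.gif"))    # binary
```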
Anonymous FTP
If you are downloading from a publicly accessible site (which you most likely will be), you should use a method called ‘anonymous’ FTP. This is where you don’t have to log on to the remote server to access the files. However, owing to the way FTP works, you do need to send a ‘username’ and ‘password’. In anonymous FTP, the username is ‘anonymous’ and the password is normally your email address. Anonymous accounts usually have restrictions – this stops you from uploading files, or makes sure you upload them only to special places on the FTP server. Other limitations could include the number of files downloaded or which directories you can browse.
So, let’s try it. We’ll try accessing the RISC OS software stored on Demon Internet’s FTP server (you don’t need to be a customer of theirs to access these files – so don’t worry). Load your internet stack and FTP client and connect to the internet. Once connected, enter the following details into the FTP client (see the documentation for full details).
Host (or Server) ftp.demon.co.uk
Path (or Directory) /pub/acorn/
User anonymous
Password [your email address]
Account [Leave blank]
Then just click connect. Your FTP client will connect to the FTP site and log you in and give you access to the files. Try downloading a file or two from the site. Once you’ve finished, disconnect from the server.
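For the curious, the same anonymous session can be sketched in a few lines of Python using its standard ftplib module. The function name and the ftp_factory parameter are my own additions (the latter only so the function can be tried out without a real server); treat it as an outline of the steps, not a finished client.

```python
from ftplib import FTP


def list_anonymous(host, path, email, ftp_factory=FTP):
    """Log in anonymously, change to path and return the file names there."""
    ftp = ftp_factory(host)            # connect to the FTP server
    try:
        ftp.login("anonymous", email)  # username 'anonymous', email as password
        ftp.cwd(path)                  # move to the directory of interest
        return ftp.nlst()              # ask for a plain list of names
    finally:
        ftp.quit()                     # always say goodbye to the server


# Example using the details from the article (needs a live connection,
# and the server has to still be there to answer):
# print(list_anonymous("ftp.demon.co.uk", "/pub/acorn/", "you@example.com"))
```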
Non-Anonymous FTP
If you are maintaining your own website, or uploading data for a third party, you may need to log in to an FTP site. From what I’ve already said about anonymous FTP, you may have an idea about how this is done. However, to illustrate in detail, let’s assume that somebody wants to upload their website to their free webspace on Demon Internet. This person’s log-in name is ‘example’ and their password is ‘pa55w0rd’ (a good mix of numbers and letters helps create a good password). Demon’s free webspace server is called homepages.demon.net. So let’s fire up an FTP client and enter the details:
Host (or Server) homepages.demon.net
Path (or Directory) [leave blank]
User example
Password pa55w0rd
Account [Leave blank]
And then you’ll be able to download the files from your website.
A better example of this is Demon’s ‘Batch-FTP’ facility, where you send commands to Demon in an email in the following format:
To ftp@demon.net
Subject [anything]
Body ftp.demon.co.uk:/pub/doc/Services.txt
quit
Demon will then fetch the file in question (which, in this example, is the file Services.txt in the directory pub/doc on ftp.demon.co.uk) and store it for you. To retrieve the files, you can connect as:
Host (or Server) ftp.demon.co.uk
Path (or Directory) [leave blank]
User example
Password pa55w0rd
Account [Leave blank]
This is extremely handy when transferring large files (such as the IMDB data) from an FTP site to a ‘closer site’ which will be quicker for you to fetch.
Uploading Data
Most RISC OS FTP clients now have a ‘filer-like’ interface which means they will look very similar to the normal RISC OS filer and operate in a very similar manner. So, to upload files, you just drag the file in question to the FTP client’s window. The file will then be uploaded to the FTP site.
One point to bear in mind is that ‘/’ is translated to ‘.’, so file filename/ext will be uploaded as filename.ext. Another point is that some web hosts (mainly commercial) expect you to upload your site into a directory called public_html or similar. If you don’t, your files will not be visible via your website. A final point is to make sure that the data is transferred in the correct format – transferring text files (such as HTML) in binary format is a waste of time and uploading binary files (such as graphics, archives and executables) in ASCII format is a bigger waste of time. Your FTP client should really be set to ‘auto’, and it will detect the filetype and change mode for you automatically.
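Behind the drag-and-drop, an upload boils down to one ‘store’ command in the right mode. Here is a rough Python sketch using the standard ftplib methods storlines (ASCII) and storbinary (binary); the upload function, its parameters and the example file name are my own inventions, and it assumes ftp is an already logged-in ftplib.FTP object.

```python
import io


def upload(ftp, fileobj, remote_name, mode="binary"):
    """Send fileobj (opened in binary mode) to the server as remote_name.

    ftp is assumed to be a logged-in ftplib.FTP instance, or anything
    offering the same storlines/storbinary methods.
    """
    command = f"STOR {remote_name}"
    if mode == "ascii":
        ftp.storlines(command, fileobj)    # plain text: sent line by line
    else:
        ftp.storbinary(command, fileobj)   # everything else: raw bytes


# An in-memory file stands in for a real web page in this sketch.
page = io.BytesIO(b"<html><body>Hello</body></html>\n")
# upload(ftp, page, "public_html/index.html", mode="ascii")
```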
Command Line Options
If your FTP client has an option to enter command line strings, you might find this little ‘RISC OS to FTP’ converter handy. If your FTP client doesn’t support this option (and not many do), you may be able to get your ‘telnet’ to issue these commands. Don’t worry if the RISC OS versions don’t look familiar – you probably have no need to use these options in FTP if you don’t use them in RISC OS.
RISC OS command FTP Command
*Rename
*Cat ls -al
*Dir
*Delete
What next?
Hopefully this article has managed to teach you some of the basics of FTP and how useful it can be. There’s plenty more detail available, from file permissions through to protecting your website, but there’s no point in me telling you things you don’t want to know, so do get in touch and let me know what you want to hear about.
Here’s my “interesting internet related dates”, what are yours? (I’ve highlighted “key dates”)
* November 1994: [general] Amazon.com founded
* November 1994: Started a public domain software library for BBC computer software
* December 1994: Changed the public domain software library to promote RISC OS software (upon purchase of an Acorn A3010)
* 1995: Started casually programming in Perl testing changes on the RISC OS Doggysoft Termite Internet software
* 26th December 1995: Got on the internet (over half my life ago). Started up with a dialup connection via Demon Internet on an Acorn A3010: involved about 3 floppy disks if I wanted to view the web (most of my time was originally on USENET newsgroups and email)
* March 1996: [general] Google started as “BackRub”
* June 1998: Commercially released BWGSMPlay as shareware software
* January 1997: [general] HTML 3.2 standard set
* February 1997: “Upgraded” from the Doggysoft Termite Internet software to the ANT Internet Suite and the ANT Fresco web browser
* May 1998: Registered my first .com domain name and web hosting via Netlink internet
* June 1998: [general] PHP 3 released
* September 1998: [general] Acorn Group PLC closed down workstation division.
* October 1998: [general] Amazon acquires bookpages.co.uk which becomes Amazon UK
* November 1998: Registered my first .co.uk domain name
* December 1998: Launched a free (call costs only) internet service
* February 1999: Appeared on Meridian TV (ITV)’s cyber.cafe about the internet
* 2000: Got my first paid internet related job (reviewing websites for the UKPlus web directory)
* May 2000: [general] PHP 4 released
* June 2000: Wrote an article entitled “What is FTP?” for Archive Magazine
* February 2001: Got my first paid internet related programming job (Systems developer for Cradley Print: developing sites in Perl and MySQL [on Windows!] and then moving to PHP)
* 2001: Upgraded to a dedicated server at PositiveInternet
* September 2002: Joined WebmasterWorld
* November 2002: Set up this blog (although some entries date back before then as they were imported from other sources)
* February 2004: [general] Facebook founded as “thefacebook”
* July 2005: [general] PHP 5 released
* 2006: Met my “wife to be” at a search engine optimisation company (and we started “flirting” over MySpace)
* March 2006: [general] Twitter founded as “twttr”
* August 2006: Switched this blog from Movabletype to WordPress
* 2007: Joined Facebook
* February 2008: Got engaged
* September 2008: Set up an internet related Ltd company
* July 2010: Joined Twitter
* September 2012: Got married
I’ve just received a spam email from VistaPrint (I have been a customer of theirs before, but not under the email address they targeted, and will now never use them again) and, taking a leaf out of RevK’s book, I’m sending them a notification of breach of The Privacy and Electronic Communications (EC Directive) Regulations 2003 and giving them 14 days to make payment as below. I’ll let you know how it goes:
NOTICE BEFORE ACTION!
You have transmitted an unsolicited communication for the purpose of direct marketing by means of electronic mail to an individual subscriber contrary to section 22 of The Privacy and Electronic Communications (EC Directive) Regulations 2003.
Under Nominet regulations (section 9.2) for .me.uk domain names (you emailed xxxxx@xxxx.me.uk), “registrants of .me.uk domain names must be, and remain at all times, natural persons” – and as the paying individual for the domain name registration and the email service thereon, I am the individual subscriber as per the above regulations.
This is not a Data Protection Act issue, or an issue with your “unsubscribe” link – the regulations have been breached by sending an email without having had a sale or negotiations with me and without my consent to the email being sent.
You now owe me damages as per section 30 of those regulations. If you promptly pay £15 in damages I will not pursue you for damages as per the regulations or report your breach to the ICO so that they can consider fining you.
I look forward to payment of £15 within 14 days or I will issue a county court claim against you without further notice.
Send payment to:
Sort code: xx-xx-xx
Account: xxxxxxxxx
Reference: 20130807-VISTAPRINT
In light of pre-action conduct directions under the civil procedure rules for action in the county court small claims track, and in consideration of the sums involved I invite “discussion and negotiation” as a means of Alternative Dispute Resolution (ADR), via email. If I receive no reply within 14 days, or if this discussion and negotiation does not resolve the matter, I will proceed with a county court claim without further notice.
I may report this matter to the criminal enforcement authority for such breaches. They may take action and involve the CPS as well as issuing fines. If you choose to resolve this matter promptly with a payment I will refrain from doing so in this case.
I look forward to your prompt reply.
Yours,
Richard Bairwell
Return-path:
Envelope-to: xxxxx@xxxx.me.uk
Delivery-date: Wed, 07 Aug 2013 11:56:31 +0000
Received: from nextmailing.biz ([89.44.179.101]:57391)
by xxxxx.xxxx.xxxx with esmtp (Exim 4.80.1)
(envelope-from)
id 1V72Lv-000168-9w
for xxxxx@xxxx.me.uk; Wed, 07 Aug 2013 11:56:31 +0000
DomainKey-Signature: [snipped in blog post]
Date: Wed, 7 Aug 2013 11:56:30 +0000
To: xxxxx@xxxx.me.uk
From: Proffesional Printing
Subject: You individual business cards + delivery included
Message-ID:
Content-Language: en
MIME-Version: 1.0
Content-Type: multipart/alternative;
boundary="b1_b413bc1dca758330778e2edbd9e9ace4"
If you have trouble reading this email, click here for a web-based version
Get 250 Premium Business Cards + FREE holder + FREE delivery for just £9.99
Boost your business
Get 250 Premium Business Cards for just £9.99
Click here
250 Premium Business Cards
Name
Phone
City
Stamp
Order Now
- Easy to create and order online
- Option to customise with your photo or logo
- Full colour printing on quality card stock
Discover our range of expert designs in our gallery, or upload your own
View More Designs
Enhance your professional image with this great offer – hurry, ends soon!
Prices for Business only. Prices ex. VAT (20%). Product upgrades and uploads are not included unless otherwise specified.
Not valid on previous purchases. See our website for further details.
We provide the highest quality, full-colour graphic design and printing at the lowest prices!
Your information is protected by SSL 128bit encryption..
Please note that we cannot monitor any emails sent in reply to this message.
If you’d like to get in touch, please visit our website.
This e-mail has been sent to xxxxx@xxxx.me.uk, respecting the legal rights regarding e-mail marketing and data protection laws in Europe. If you consider that this e-mail has been sent to you by mistake, please let us know as soon as possible.
To remove your e-mail from our mailing list, click here.