
Month: July 2009

PHP: Extract just the domain name from a URL

I posted this on Stack Overflow, but for reference purposes I'll keep a copy here as well on how to extract the domain name from a URL in PHP.

If you just want to handle three-character top-level domains, then this code works:

    <?php
    // Let's test that the code works: these should all return
    // example.com, example.net or example.org
    $domains = array('here.example.com',
                     'example.com',
                     'example.org',
                     'here.example.org',
                     'example.com/ignorethis',
                     'example.net/',
                     'http://here.example.org/longtest?string=here');

    foreach ($domains as $domain) {
        testdomain($domain);
    }

    function testdomain($url) {
        // A DNS label is at most 63 characters: one leading letter plus up to 62 more.
        if (preg_match('/^((.+)\.)?([A-Za-z][0-9A-Za-z\-]{1,62})\.([A-Za-z]{3})(\/.*)?$/', $url, $matches)) {
            print 'Domain is: '.$matches[3].'.'.$matches[4].".\n";
        } else {
            print 'Domain not found in '.$url.".\n";
        }
    }
    ?>

$matches[1] (with its trailing dot) and $matches[2] (without it) will contain any subdomain and/or protocol, $matches[3] contains the domain name, $matches[4] the top-level domain and $matches[5] contains any other URL path information.

To match most common top-level domains you could try changing it to:

    if (preg_match('/^((.+)\.)?([A-Za-z][0-9A-Za-z\-]{1,62})\.([A-Za-z]{2,6})(\/.*)?$/', $url, $matches)) {

Or, to cope with multi-part suffixes such as co.uk, list the suffixes you want to accept explicitly:

    if (preg_match('/^((.+)\.)?([A-Za-z][0-9A-Za-z\-]{1,62})\.(co\.uk|me\.uk|org\.uk|com|org|net|int|eu)(\/.*)?$/', $url, $matches)) {

and so on for any other suffixes you need.

If you want a list of top-level domains, you may find Mozilla's TLD List useful and the DKIM Reputation Detected Registered Domains code handy.
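The patterns above are applied to the whole string, so a dot in the path can be picked up as part of the domain: with the three-character pattern, example.com/foo.bar.txt matches as bar.txt. One possible alternative, sketched here under a few assumptions, is to isolate the host with parse_url() first and then compare it against a list of suffixes, longest first. The function name registeredDomain() and the short suffix list are hypothetical and only for illustration. A real list would come from something like the TLD list mentioned above.

```php
<?php
// Hypothetical helper, not from the original post.
// $suffixes is a small illustrative sample, not a complete list.
function registeredDomain(string $url, array $suffixes): ?string
{
    // parse_url() only reports a host when a scheme is present, so add
    // one for bare inputs such as "example.com/ignorethis".
    if (!preg_match('#^[a-z][a-z0-9+.\-]*://#i', $url)) {
        $url = 'http://' . $url;
    }

    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null; // malformed URL or no host component
    }
    $host = strtolower(rtrim($host, '.'));

    // Try longer suffixes first so "co.uk" is preferred over any shorter match.
    usort($suffixes, function ($a, $b) {
        return strlen($b) - strlen($a);
    });

    foreach ($suffixes as $suffix) {
        $tail = '.' . $suffix;
        if (strlen($host) > strlen($tail) && substr($host, -strlen($tail)) === $tail) {
            // Everything before the suffix, e.g. "here.example"
            $rest = substr($host, 0, -strlen($tail));
            $labels = explode('.', $rest);
            // The last label before the suffix is the domain name itself.
            return end($labels) . $tail;
        }
    }

    return null; // host does not end in any listed suffix
}

$sample = array('com', 'org', 'net', 'co.uk');
echo registeredDomain('http://here.example.org/longtest?string=here', $sample), "\n"; // example.org
echo registeredDomain('example.com/foo.bar.txt', $sample), "\n";                      // example.com
echo registeredDomain('www.example.co.uk', $sample), "\n";                            // example.co.uk
```

Because the path and query string are discarded before any matching happens, the suffix check only ever sees the host name. The trade-off is that the result is only as good as the suffix list you supply.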