[thelist] PHP: getting domain names from a url - being foiled by .co.uk addresses

Kelly Hallman khallman at ultrafancy.com
Sat Dec 6 14:51:08 CST 2003


On Sat, 6 Dec 2003, Dunstan Orchard wrote:
> Manuel González Noriega wrote:
> > not a full answer but can't you just explode($host)  on '.', rather
> > than using strpos(), that way you can count() the elements of the
> > resulting array and act accordingly.
> 
> well, I could explode it (though I think it would be a bit messier?)

Here's what I was thinking when I read the first part of your post, which 
is what I think Manuel was getting at. I don't think it's messy :)

$link = 'http://www.123.456.metafilter.com/mefi/28945';
$parsed = parse_url($link);
$domain = implode('.',array_slice(explode('.',$parsed['host']),-2));
// $domain == 'metafilter.com'

> but I'd still be left without knowing what's a subdomain, and what's
> a... bit at the end (what are the .com, .net, .co.uk things called?) -
> what to keep and what to discard.

Yes, that may be quite a problem, given all the possible combinations. I 
don't think you even mentioned the possibility of it being an IP address.

You may have to employ some kind of lookup array to decide how many 
components you want from the right side. You could check the host against 
the lookup table, then use the matching value as the offset you want...

$link = 'http://www.metafilter.co.uk/mefi/28945';
$parsed = parse_url($link);
$hostname = $parsed['host'];

// match domain against component table
$components = 2;
$component_table = array( 'co.uk' => 3, 'ca.us' => 3 );
foreach ($component_table as $d => $c) {
  // case-insensitive match of ".$d" at the very end of the hostname
  if (preg_match('/\.' . preg_quote($d, '/') . '$/i', $hostname)) {
    $components = $c;
  }
}

$domain = implode('.',
  array_slice(explode('.',$parsed['host']),-($components)));

I tested this code and it does work. I'm not sure a lookup table is the
best approach, but it's a possibility. You might also check the right-most
component to see if it's numeric and, if so, return the whole hostname, as
it's probably an IP address. You might want to write the whole thing as a
function to make it more useful...
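
Something along these lines might do as a starting point. It's only a rough
sketch: the function name get_domain() is made up for illustration, the
lookup table holds just the two example suffixes from above (a real one
would need many more), and the numeric check assumes a dotted IPv4 address,
so IPv6 hosts are not handled.

```php
<?php
// Hypothetical helper: the name and the default table are illustrative only.
function get_domain($url, array $component_table = array('co.uk' => 3, 'ca.us' => 3)) {
    $host = parse_url($url, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        return null; // no host found, e.g. the URL has no scheme
    }
    $host  = strtolower($host);
    $parts = explode('.', $host);

    // A numeric right-most component suggests an IPv4 address: return it whole.
    if (ctype_digit(end($parts))) {
        return $host;
    }

    $components = 2; // default: keep the last two components
    foreach ($component_table as $suffix => $count) {
        $tail = '.' . $suffix;
        // plain suffix comparison, so no regex escaping is needed
        if (substr($host, -strlen($tail)) === $tail) {
            $components = $count;
        }
    }
    return implode('.', array_slice($parts, -$components));
}
```

With that in place, get_domain('http://www.metafilter.co.uk/mefi/28945')
gives 'metafilter.co.uk', and a link to a bare IP address such as
'http://192.168.0.1/index.html' comes back as '192.168.0.1'.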

-- 
Kelly Hallman
// Ultrafancy



