Delete domain extension

Let's say I have just-a.domain.com, just-a-domain.info and just.a-domain.net. How can I remove the extension (.com, .net, .info, ...)? I need the result in two variables: one with the domain name and the other with the extension.

I tried str_replace but it doesn't work. I think this can only be done with a regex.

+6
5 answers
    $subject = 'just-a.domain.com';
    // Split at the position just before the last dot (zero-width lookahead).
    $result = preg_split('/(?=\.[^.]+$)/', $subject);

This produces the following array:

    $result[0] == 'just-a.domain';
    $result[1] == '.com';
+8
    preg_match('/(.*?)((?:\.co)?\.[a-z]{2,4})$/i', $domain, $matches);

$matches[1] will contain the domain name and $matches[2] the extension.

    <?php
    $domains = array("google.com", "google.in", "google.co.in", "google.info", "analytics.google.com");
    foreach ($domains as $domain) {
        // Group 1: everything before the extension.
        // Group 2: an optional ".co" followed by a final label of 2 to 4 letters.
        preg_match('/(.*?)((?:\.co)?\.[a-z]{2,4})$/i', $domain, $matches);
        print_r($matches);
    }
    ?>

This outputs:

    Array ( [0] => google.com [1] => google [2] => .com )
    Array ( [0] => google.in [1] => google [2] => .in )
    Array ( [0] => google.co.in [1] => google [2] => .co.in )
    Array ( [0] => google.info [1] => google [2] => .info )
    Array ( [0] => analytics.google.com [1] => analytics.google [2] => .com )
+10

If you want to remove the part of the domain that is administered by domain name registrars, you will need a list of suffixes such as the Public Suffix List.

Walking through that list and testing each suffix against the domain name is not very efficient, so use the list only to build an index like this:

    $tlds = array(
        // ac : http://en.wikipedia.org/wiki/.ac
        'ac',
        'com.ac',
        'edu.ac',
        'gov.ac',
        'net.ac',
        'mil.ac',
        'org.ac',
        // ad : http://en.wikipedia.org/wiki/.ad
        'ad',
        'nom.ad',
        // …
    );
    $tldIndex = array_flip($tlds);

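The longest matching suffix can then be found as follows:

    $levels = explode('.', $domain);
    $n = count($levels);
    // Grow the candidate suffix one label at a time, starting from the right,
    // until it is no longer present in the index.
    for ($length = 1; $length <= $n; ++$length) {
        $suffix = implode('.', array_slice($levels, -$length));
        if (!isset($tldIndex[$suffix])) {
            break;
        }
    }
    $length--; // number of labels in the longest matching suffix
    // array_slice() with an offset of 0 returns every label, so "no match" is handled explicitly.
    $suffix = $length > 0 ? implode('.', array_slice($levels, -$length)) : '';
    $prefix = $length > 0 ? substr($domain, 0, -strlen($suffix) - 1) : $domain;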

Or create a tree that represents a hierarchy of domain name levels as follows:

    $tldTree = array(
        // ac : http://en.wikipedia.org/wiki/.ac
        'ac' => array(
            'com' => true,
            'edu' => true,
            'gov' => true,
            'net' => true,
            'mil' => true,
            'org' => true,
        ),
        // ad : http://en.wikipedia.org/wiki/.ad
        'ad' => array(
            'nom' => true,
        ),
        // …
    );

Then you can use the following to find a match:

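    $levels = explode('.', $domain);
    $r = &$tldTree;
    $length = 0;
    // Walk the labels from right to left, descending the tree while they match.
    foreach (array_reverse($levels) as $level) {
        if (is_array($r) && isset($r[$level])) {
            $r = &$r[$level];
            $length++;
        } else {
            break;
        }
    }
    unset($r); // drop the reference so later code cannot modify the tree through it
    // array_slice() with an offset of 0 returns every label, so "no match" is handled explicitly.
    $suffix = $length > 0 ? implode('.', array_slice($levels, -$length)) : '';
    $prefix = $length > 0 ? substr($domain, 0, -strlen($suffix) - 1) : $domain;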
+7

Neither a regex nor parse_url() will solve this for you.

You need a package that uses the Public Suffix List. Only that way can you correctly handle domains with second- and third-level suffixes (co.uk, a.bg, b.bg, etc.). I recommend using TLD Extract.

Here is some sample code:

    $extract = new LayerShifter\TLDExtract\Extract();
    $result = $extract->parse('just.a-domain.net');

    $result->getSubdomain();         // will return (string) 'just'
    $result->getHostname();          // will return (string) 'a-domain'
    $result->getSuffix();            // will return (string) 'net'
    $result->getRegistrableDomain(); // will return (string) 'a-domain.net'
+1
    strrpos($str, ".")

This gives you the index of the last period in the string. You can then call substr() with that index to split the string into the name and the extension.
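
A minimal sketch of that idea, to show how the two variables asked for in the question could be filled. The helper name splitAtLastDot() is hypothetical and not part of the answer. Like any last-dot split, it treats only the final label as the extension, so a multi-part suffix such as co.uk would yield just .uk.

```php
<?php
// Hypothetical helper (not from the original answer): split a host name at its last dot.
function splitAtLastDot(string $host): array
{
    $pos = strrpos($host, '.');
    if ($pos === false) {
        // No dot at all: treat the whole string as the name, with an empty extension.
        return [$host, ''];
    }
    // Everything before the last dot, then the last dot and what follows it.
    return [substr($host, 0, $pos), substr($host, $pos)];
}

[$name, $extension] = splitAtLastDot('just-a.domain.com');
// $name is 'just-a.domain', $extension is '.com'
```

The strict comparison with false matters here: strrpos() returns 0 when the dot is the first character, and a loose check would confuse that with "not found".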

-1