I know this comes years late, but why not do it like this:
$dom = 'a.b.c.d.co.jp';
$sub = preg_replace('/.*?([^\.]+)(\.((co\.\w+)|\w+))$/i', '\1\2', $dom);
echo $sub;
It prints d.co.jp
where .*?([^\.]+)(\.((co\.\w+)|\w+))$ breaks down as follows:
.*? lazily matches any characters (lazy, so that it does not swallow the main domain) up to the start of the next group
([^\.]+) matches a run of characters containing no period, that is, the main domain label (the + ensures at least one character); it is captured and referenced later as \1
(\.((co\.\w+)|\w+)) matches the TLD together with its leading period, whether it is .co.something or just .something, and captures it as \2; the plus signs again require at least one character
$ anchors everything to the end of the string, so the match is built from the TLD leftward and any number of subdomain parts further left is skipped
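To make the roles of the two capture groups visible, here is a minimal sketch that uses preg_match instead of preg_replace with the same pattern. The function name registrable_domain and the sample host names are my own for illustration; they are not part of the snippet above.

```php
<?php
// Hypothetical helper; the pattern is the one explained above.
function registrable_domain(string $host): ?string
{
    $pattern = '/.*?([^\.]+)(\.((co\.\w+)|\w+))$/i';
    if (preg_match($pattern, $host, $m) !== 1) {
        return null; // no dot-separated suffix found, e.g. "localhost"
    }
    // $m[1] is the label before the suffix, $m[2] the suffix with its leading dot
    return $m[1] . $m[2];
}

foreach (['a.b.c.d.co.jp', 'www.example.com', 'example.org'] as $host) {
    echo $host, ' => ', registrable_domain($host), "\n";
}
// a.b.c.d.co.jp => d.co.jp
// www.example.com => example.com
// example.org => example.org
```

Returning null for input without a dot is a design choice of this sketch; preg_replace in the one-liner would simply return such input unchanged.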
PS: I do not know whether there are other two-part suffixes besides .co.something, but if there are, they can be added to the alternation in the same way. A quick look through https://en.wikipedia.org/wiki/List_of_Internet_top-level_domains did not show me any, and if some exist, I do not think there are many.
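If more two-part suffixes turn out to be needed, one way to add them is to build the alternation from a list. This is only a sketch: the function name base_domain and the labels in $secondLevel (com, org, ac, as in .com.au or .org.uk) are assumptions chosen for illustration, not a complete or authoritative list.

```php
<?php
// Hypothetical helper: same pattern as above, but the "co" part is
// replaced by an alternation built from a caller-supplied list.
function base_domain(string $host, array $secondLevel): string
{
    // Escape each label so it is treated literally inside the regex
    $alt = implode('|', array_map('preg_quote', $secondLevel));
    $pattern = '/.*?([^\.]+)(\.(((?:' . $alt . ')\.\w+)|\w+))$/i';
    return (string) preg_replace($pattern, '\1\2', $host);
}

// Assumed example list; extend it as needed
$secondLevel = ['co', 'com', 'org', 'ac'];

echo base_domain('shop.example.com.au', $secondLevel), "\n"; // example.com.au
echo base_domain('a.b.c.d.co.jp', $secondLevel), "\n";       // d.co.jp
echo base_domain('www.example.com', $secondLevel), "\n";     // example.com
```

Like the original pattern, this stays a heuristic: a host such as www.co.com would be returned whole, because co.com looks like a two-part suffix to the regex.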
Gg