What regular expression can I use to get a domain name from a URL in Ruby?

I am trying to build a regex that extracts the domain from a URL.

For these URLs:

 http://www.abc.google.com/
 http://abc.google.com/
 https://www.abc.google.com/
 http://abc.google.com/

the result should be:

 abc.google.com 
+6
ruby regex
4 answers
 require 'uri'
 URI.parse('http://www.abc.google.com/').host #=> "www.abc.google.com"

Not a regular expression, but probably more reliable than anything we come up with here.

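 URI.parse('http://www.abc.google.com/').host.sub(/\Awww\./, '') #=> "abc.google.com"

If you want to remove a leading www., this does it. When the host does not start with www., the call raises no error and returns the host unchanged. (\A anchors at the start of the string; in Ruby, ^ matches at the start of any line.)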
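As a rough sketch of how the two snippets above fit together, the helper below parses the URL with the standard uri library and strips a leading www. from the host. The method name domain_from is hypothetical (it is not part of the answer), and the sketch assumes the input is an absolute URL; for a string without a scheme, URI.parse reports no host, so the helper returns nil.

```ruby
require 'uri'

# Hypothetical helper: host of an absolute URL, without a leading "www.".
def domain_from(url)
  host = URI.parse(url).host
  # host is nil when the string has no authority part (e.g. no scheme).
  return nil if host.nil?

  host.sub(/\Awww\./, '')
end

urls = %w[
  http://www.abc.google.com/
  http://abc.google.com/
  https://www.abc.google.com/
]

# Prints abc.google.com once for each URL.
urls.each { |url| puts domain_from(url) }
```

Note that this only removes www.; it does not decide which part of the host is the registrable domain.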

+25

I don't know much about Ruby, but this regex pattern captures the last three dot-separated parts of the URL, excluding the trailing slash. Each part must be at least 2 characters long, and the URL must end with a slash.

 ([\w-]{2,}\.[\w-]{2,}\.[\w-]{2,})/$ 
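For illustration, here is one way the pattern above might be applied in Ruby. The constant and method names are hypothetical; only the pattern itself comes from the answer. Because the pattern requires a trailing slash and exactly three captured parts, URLs without the slash or with fewer than three parts do not match.

```ruby
# Pattern from the answer: three dot-separated parts followed by a trailing slash.
LAST_THREE_PARTS = /([\w-]{2,}\.[\w-]{2,}\.[\w-]{2,})\/$/

# Hypothetical helper: returns the captured group, or nil when there is no match.
def last_three_parts(url)
  match = LAST_THREE_PARTS.match(url)
  match && match[1]
end

puts last_three_parts('http://www.abc.google.com/')   # abc.google.com
puts last_three_parts('https://abc.google.com/')      # abc.google.com
p last_three_parts('http://abc.google.com')           # nil (no trailing slash)
```

The www. prefix is dropped here only because the pattern keeps the last three parts, so a host such as www.google.com would be returned in full.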
+1

You can use the domain_name gem for this kind of work. From the README:

 require "domain_name"
 host = DomainName("a.b.example.co.uk")
 host.domain #=> "example.co.uk"
0

Your question is a bit vague. Can you say precisely what you want to do? (Preferably with a test suite.) Right now, all your question says is that you want a method that always returns 'abc.google.com'. That is easy:

 def extract_domain
   return 'abc.google.com'
 end

But this is probably not what you had in mind…

Also, you say you need a Regexp. What for? What is wrong with, for example, the URI class? After all, parsing and processing URIs is exactly what it was created for!

 require 'uri'
 URI.parse('https://abc.google.com/').host # => 'abc.google.com'

And finally, you say that you are trying to extract the domain, but you never specify what you mean by "domain". It looks like you sometimes mean the fully qualified domain name (FQDN) and sometimes drop parts of it, but according to what rules? For example, for the FQDN abc.google.com, the domain name is google.com and the host name is abc, yet you want abc.google.com returned, which is not just the domain name but the whole FQDN. Why?

-1