Using a C# regular expression to parse a domain name?

I need to parse a domain name from a string. The string can take several forms, and I need the exact domain.

Examples of lines:

 http://somename.de/
 www.somename.de/
 somename.de/
 somename.de/somesubdirectory
 www.somename.de/?pe=12

I need it in the following format, with the domain name, the TLD, and www if present:

 www.somename.de 

How can I do this in C#?

+7
c# regex
4 answers

I simply used:

 // requires: using System;
 Uri uri = new Uri("http://www.google.com/search?q=439489");
 string url = uri.Host; // Host is already a string, so ToString() is not needed
 return url;

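This way you rely on the framework's own URI parser rather than a hand-written pattern, so you can be sure the host is extracted correctly.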

+10

As an alternative to regex, you can let the System.Uri class parse the string for you. You just need to make sure the string contains the scheme.

 string uriString = "http://www.google.com/search";
 if (!uriString.Contains(Uri.SchemeDelimiter))
 {
     uriString = string.Concat(Uri.UriSchemeHttp, Uri.SchemeDelimiter, uriString);
 }
 string domain = new Uri(uriString).Host;

This solution also strips any port number and converts IPv6 addresses to their canonical form.
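
To try this against the inputs from the question, the snippet can be wrapped in a small helper. This is only a sketch: the class name HostExtractor and the method name GetHost are made up for illustration, while Uri.SchemeDelimiter, Uri.UriSchemeHttp, and Uri.Host are the same members used above.

```csharp
using System;

public static class HostExtractor
{
    // Hypothetical helper name; the body follows the snippet from this answer.
    public static string GetHost(string input)
    {
        string uriString = input;

        // System.Uri needs an absolute URI, so prepend "http://" when no scheme is present.
        if (!uriString.Contains(Uri.SchemeDelimiter))
        {
            uriString = string.Concat(Uri.UriSchemeHttp, Uri.SchemeDelimiter, uriString);
        }

        // Host excludes the scheme, port, path, and query string.
        return new Uri(uriString).Host;
    }
}
```

With this helper, GetHost("www.somename.de/?pe=12") should return www.somename.de, and GetHost("somename.de:8080/somesubdirectory") should return somename.de, because Host does not include the port.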

+13

I checked the Regular Expression Library, and it looks like this might work for you:

 ^(([\w][\w\-\.]*)\.)?([\w][\w\-]+)(\.([\w][\w\.]*))?$ 
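
Because the pattern is anchored with ^ and $ and has no character class that accepts : / or ?, it appears better suited to checking that a string is already a bare host name than to pulling one out of a full URL. A minimal sketch of that use, where the class name HostNamePattern and the method name IsHostName are hypothetical:

```csharp
using System.Text.RegularExpressions;

public static class HostNamePattern
{
    // Pattern copied from this answer; the surrounding names are illustrative only.
    private static readonly Regex Pattern =
        new Regex(@"^(([\w][\w\-\.]*)\.)?([\w][\w\-]+)(\.([\w][\w\.]*))?$");

    // True only when the whole input matches the host name pattern.
    public static bool IsHostName(string input)
    {
        return Pattern.IsMatch(input);
    }
}
```

IsHostName("www.somename.de") should be true, while IsHostName("http://somename.de/") and IsHostName("somename.de/somesubdirectory") should be false, so the scheme and path would have to be removed first.
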
+2

Try the following:

 ^(?:\w+://)?([^/?]*) 

This is a lenient regular expression: it does not validate the string but assumes it is already a URL, and it captures everything before the first slash, ignoring the protocol. The domain ends up in the first capture group, for example:

 string url = "http://www.google.com/hello";
 Match match = Regex.Match(url, @"^(?:\w+://)?([^/?]*)");
 string domain = match.Groups[1].Value;

As a bonus, the capture also stops at the first ?, so a URL such as google.com?hello=world works as expected.
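
To check the behaviour on inputs without a scheme or with a query string, the same pattern can be put behind a small wrapper. The names RegexDomain and GetDomain are hypothetical; only the pattern comes from this answer.

```csharp
using System.Text.RegularExpressions;

public static class RegexDomain
{
    // Hypothetical wrapper around the pattern from this answer.
    public static string GetDomain(string url)
    {
        Match match = Regex.Match(url, @"^(?:\w+://)?([^/?]*)");

        // Group 1 holds everything between the optional scheme and the first '/' or '?'.
        return match.Success ? match.Groups[1].Value : string.Empty;
    }
}
```

GetDomain("google.com?hello=world") should return google.com and GetDomain("www.somename.de/?pe=12") should return www.somename.de. Note that a port is kept: GetDomain("somename.de:8080/x") should return somename.de:8080.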

+1