Regular expression to remove <a href="https://stackoverflow.com/xx" rel="nofollow noreferrer">Name</a> tags from a string

I need a regular expression that removes the tag from the following URL markup, <a href="http://example.com">Name</a>, so that only the "Name" string is displayed. I am using C# .NET.

Any help is appreciated.

+4
5 answers

This should do the job nicely:

 str = Regex.Replace(str, @"<a\b[^>]+>([^<]*(?:(?!</a)<[^<]*)*)</a>", "$1"); 
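As a quick check, the call above can be wrapped in a small console program. This is only a sketch: the class name `StripAnchorDemo`, the helper `StripAnchors`, and the second sample input are made up for illustration and are not part of the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

static class StripAnchorDemo
{
    // Same pattern as above: group 1 captures everything between <a ...> and </a>,
    // including nested tags, as long as no nested </a appears.
    const string Pattern = @"<a\b[^>]+>([^<]*(?:(?!</a)<[^<]*)*)</a>";

    // Hypothetical helper: replaces each anchor element with its inner content.
    static string StripAnchors(string html) => Regex.Replace(html, Pattern, "$1");

    static void Main()
    {
        // Input from the question.
        Console.WriteLine(StripAnchors("<a href=\"http://example.com\">Name</a>"));

        // Illustrative input with a nested tag; only the anchor is removed.
        Console.WriteLine(StripAnchors("<a href=\"blah\"><b>boldness</b></a>"));
    }
}
```

Note that `[^>]+` needs at least one character after `<a`, so a bare `<a>foo</a>` is left untouched, and the match is case sensitive unless `RegexOptions.IgnoreCase` is passed.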
+27

You should look at the Html Agility Pack. Regex works in almost all cases, but it fails on some basic constructs and on broken HTML. Since HTML grammar is not regular, a parser such as the Html Agility Pack keeps working in the cases where a regex breaks.

If you only need a one-off solution for this particular case of anchor tags, any of the regexes above will work for you, but the Html Agility Pack is the robust long-term solution for removing any HTML tags.

Link: Using C# regular expressions to remove HTML tags
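A minimal sketch of the parser route, assuming the HtmlAgilityPack NuGet package is referenced. The class name `UnwrapAnchorsDemo` and the helper `UnwrapAnchors` are hypothetical; the answer itself does not show code, so this is one possible way to do it rather than the author's method.

```csharp
using System;
using HtmlAgilityPack; // assumption: the HtmlAgilityPack NuGet package is installed

static class UnwrapAnchorsDemo
{
    // Hypothetical helper: removes every <a> element but keeps its children.
    static string UnwrapAnchors(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var anchors = doc.DocumentNode.SelectNodes("//a");
        if (anchors != null)
        {
            foreach (var anchor in anchors)
            {
                // keepGrandChildren: true drops the <a> node itself and
                // leaves its child nodes in the same position.
                anchor.ParentNode.RemoveChild(anchor, true);
            }
        }

        return doc.DocumentNode.OuterHtml;
    }

    static void Main()
    {
        Console.WriteLine(UnwrapAnchors("<p>See <a href=\"http://example.com\">Name</a> here.</p>"));
    }
}
```

Because the parser builds a tree first, attribute values containing `>` or mixed-case tag names do not need special handling in the calling code.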

+3

You can try this one. It has not been tested under all conditions, but it returns the correct value for your example.

 \<[^\>]+\>(.[^\<]+)</[^\>]+\> 

Here is a version that works only for <a> tags.

 \<a\s[^\>]+\>(.[^\<]+)</a\> 

I tested it on the following HTML, and it returned Name and Value.

 <a href="http://xx.com">Name</a><label>This is a label</label> <a href="http://xx.com">Value</a> 
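The test described above could look roughly like this in C#. It is a sketch, not the answerer's actual test code: the class name `ExtractAnchorTextDemo` is invented, and it reads capture group 1 from each match of the `<a>`-only pattern.

```csharp
using System;
using System.Text.RegularExpressions;

static class ExtractAnchorTextDemo
{
    static void Main()
    {
        // Sample HTML from the answer.
        string html = "<a href=\"http://xx.com\">Name</a><label>This is a label</label> <a href=\"http://xx.com\">Value</a>";

        // The <a>-only pattern from the answer; group 1 is the link text.
        string pattern = @"\<a\s[^\>]+\>(.[^\<]+)</a\>";

        foreach (Match match in Regex.Matches(html, pattern))
        {
            // Prints "Name", then "Value".
            Console.WriteLine(match.Groups[1].Value);
        }
    }
}
```

Since the group is `.[^\<]+`, the link text must be at least two characters long and must not contain nested tags for a match to succeed.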
0

I agree with Priyank that using a parser is the safer bet. If you do go the regex route, think about how you want to handle edge cases. The simple case you mentioned in your question is easy to transform, and if that really is the only form the markup can take, a simple regex can handle it. But if the markup is user-generated or comes from a third-party source, for example, consider cases such as:

 <a>foo</a> --> foo
 # a bare anchor tag, with no attributes
 # the regexes listed above wouldn't handle this

 <a href="blah"><b>boldness</b></a> --> <b>boldness</b>
 # stripping out only the anchor tag

 <A onClick="javascript:alert('foo')">Upper\ncase</A> --> Upper\ncase
 # and obviously the regex should be case insensitive and
 # apply to the entire string, not just one line at a time.

 <a href="javascript:alert('<b>boom</b>')"><b>bold</b>bar</a> --> <b>bold</b>bar
 # cases such as this tend to break a lot of regexes,
 # if the markup in question is user generated, you're leaving
 # yourself open to the risk of XSS
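For illustration, here is one regex that copes with the four cases listed, as a hedged sketch rather than a recommendation. The pattern, the class name `AnchorEdgeCasesDemo`, and the helper `StripAnchorTags` are my own assumptions. It strips only the tags, and it is not an HTML sanitizer, so it does nothing about the XSS risk mentioned above.

```csharp
using System;
using System.Text.RegularExpressions;

static class AnchorEdgeCasesDemo
{
    // Matches a closing </a> or an opening <a ...>. Quoted attribute values are
    // consumed as a unit, so a '>' inside quotes does not end the tag early.
    // \b keeps tags such as <abbr> from matching.
    static readonly Regex AnchorTag = new Regex(
        @"</a\s*>|<a\b(?:[^>""']|""[^""]*""|'[^']*')*>",
        RegexOptions.IgnoreCase);

    // Hypothetical helper: removes anchor tags and keeps everything between them.
    static string StripAnchorTags(string html) => AnchorTag.Replace(html, "");

    static void Main()
    {
        string[] samples =
        {
            "<a>foo</a>",
            "<a href=\"blah\"><b>boldness</b></a>",
            "<A onClick=\"javascript:alert('foo')\">Upper\ncase</A>",
            "<a href=\"javascript:alert('<b>boom</b>')\"><b>bold</b>bar</a>",
        };

        foreach (string sample in samples)
        {
            Console.WriteLine(StripAnchorTags(sample));
        }
    }
}
```

The pattern uses only negated character classes, which also match newlines, so it applies across the whole string without `RegexOptions.Singleline`. Malformed markup, such as an unterminated quote, can still defeat it, which is the argument for a parser.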
0

The following works for me.

 Regex.Replace(inputvalue, @"</?a\b[^>]*>", "")
0