Removing anchors from text

I need to remove anchor tags from some text, and it seems I can't do it with a regex.
Just the anchor tags, not their contents.
For example, <a href="http://www.google.com/" target="_blank">google</a> should become google.

+7
7 answers

Exactly: this cannot be done correctly with a regular expression.

Here is an example of using the DOM:

    $xml = new DOMDocument();
    $xml->loadHTML($html);

    $links = $xml->getElementsByTagName('a');

    // Loop through the <a> tags in reverse and replace each one with its text content
    for ($i = $links->length - 1; $i >= 0; $i--) {
        $linkNode = $links->item($i);
        $lnkText = $linkNode->textContent;
        $newTxtNode = $xml->createTextNode($lnkText);
        $linkNode->parentNode->replaceChild($newTxtNode, $linkNode);
    }

It is important to iterate in reverse when making changes to the DOM: the list returned by getElementsByTagName() is live, so it shrinks as the anchors are replaced.
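Note that textContent discards any markup nested inside the anchor, such as an <img>. As a minimal sketch of a variant that keeps nested elements, the anchor's children can be moved out before the anchor itself is removed. The function name unwrapAnchors and the sample markup are hypothetical, and the sketch assumes ASCII input and that loadHTML() wraps the fragment in a body element:

```php
<?php
// Hypothetical helper (not from the answer above): replaces each <a> element
// with its own child nodes, so nested markup such as <img> is preserved.
function unwrapAnchors(string $html): string
{
    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);

    $doc = new DOMDocument();
    $doc->loadHTML($html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // The node list is live, so walk it backwards while removing anchors.
    $links = $doc->getElementsByTagName('a');
    for ($i = $links->length - 1; $i >= 0; $i--) {
        $link = $links->item($i);
        // insertBefore() moves each child out of the anchor, in order.
        while ($link->firstChild !== null) {
            $link->parentNode->insertBefore($link->firstChild, $link);
        }
        $link->parentNode->removeChild($link);
    }

    // loadHTML() wraps fragments in <html><body>; return only the body's contents.
    $body = $doc->getElementsByTagName('body')->item(0);
    $result = '';
    if ($body !== null) {
        foreach ($body->childNodes as $child) {
            $result .= $doc->saveHTML($child);
        }
    }
    return $result;
}

$sample = '<p>See <a href="http://www.google.com/"><img src="logo.png"> google</a>.</p>';
$unwrapped = unwrapAnchors($sample);
echo $unwrapped, "\n";
```

The serialized result may differ from the input in whitespace or attribute quoting, since it is regenerated from the parsed tree.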

+12

Then you can try

 preg_replace('/<\/?a[^>]*>/','',$Source); 

I tested it online with Rubular.
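One caveat, shown here as a hedged illustration: because `a[^>]*` accepts any characters after the a, this pattern also removes tags whose names merely start with a, such as <abbr>. Adding a word boundary (`\b`) after the tag name is one possible way to narrow it. The sample string and variable names below are made up for the demonstration:

```php
<?php
// Made-up sample containing both a real anchor and an <abbr> tag.
$source = '<abbr title="x">HTML</abbr> <a href="#">link</a>';

// Pattern from the answer: also strips <abbr> because its name starts with "a".
$loose = preg_replace('/<\/?a[^>]*>/', '', $source);

// Variant with a word boundary after "a": leaves <abbr> alone.
$strict = preg_replace('/<\/?a\b[^>]*>/i', '', $source);

echo $loose, "\n";   // HTML link
echo $strict, "\n";  // <abbr title="x">HTML</abbr> link
```

The word boundary is still an approximation: it would also match a custom element named something like a-widget, since a hyphen counts as a non-word character.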

+10

You are looking for strip_tags().

    <?php
    // outputs 'google'
    echo strip_tags('<a href="http://www.google.com/" target="_blank">google</a>');
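Keep in mind that strip_tags() removes every tag, not only anchors. If the surrounding markup must survive, its second parameter accepts a list of tags to keep. A small sketch, with a made-up input string:

```php
<?php
// Made-up input mixing an anchor with other tags.
$html = '<p>Visit <a href="http://www.google.com/" target="_blank">google</a> <b>today</b></p>';

// Default behavior: every tag is removed.
$all = strip_tags($html);

// Allow-list <p> and <b>; the anchor tags are still removed.
$keep = strip_tags($html, '<p><b>');

echo $all, "\n";   // Visit google today
echo $keep, "\n";  // <p>Visit google <b>today</b></p>
```

This only works when the set of tags to keep is known in advance, since the parameter is an allow-list rather than a list of tags to remove.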
+7

This question has already been answered, but I thought I would add my solution to the mix. I like it better than the accepted answer.

 $content = preg_replace(array('"<a href(.*?)>"', '"</a>"'), array('',''), $content); 
+4

Using a regex:

preg_replace('/<a[^>]+>([^<]+)<\/a>/i','\1',$html);

+3

Try:

    $str = '<p>paragraph</p><a href="http://www.google.com/" target="_blank" title="<>">google -> foo</a><div>In the div</div>';
    // first, extract the anchor tag
    preg_match("~<a .*?</a>~", $str, $match);
    // then strip the HTML tags
    echo strip_tags($match[0]), "\n";

Output:

 google -> foo 
0

Most of the regexes here didn't help me. Some of them delete the content inside the anchor (which is not at all what the OP asked for), and not even all of it; some of them match any tag starting with a; etc.

This is what I created for my needs at work. We had a problem where passing HTML containing anchor tags (with many data attributes and other attributes) to wkhtmltopdf sometimes prevented the PDF from being created, so I wanted to remove the tags while preserving the text.

Regex:

/<\/?a( [^>]*)?>/ig

In PHP you can:

    $text = "<a href='http://www.google.com/'>Google1</a><br>" .
        "<a>Google2</a><br>" .
        "<afaketag href='http://www.google.com'>Google2</afaketag><br>" .
        "<afaketag>Google4</afaketag><br>" .
        "<a href='http://www.google.com'><img src='someimage.jpg'></a>";

    echo preg_replace("/<\/?a( [^>]*)?>/i", "", $text);

Outputs:

 Google1<br>Google2<br><afaketag href='http://www.google.com'>Google2</afaketag><br><afaketag>Google4</afaketag><br><img src='someimage.jpg'> 
0
