Removing anchors from text

Question

I need to remove anchor tags from some text, and I can't seem to do it with a regex.
Only the anchor tags should be removed, not their contents.
For example, <a href="http://www.google.com/" target="_blank">google</a> should become google.

+7

php regex

Lior May 03 '11 at 13:28

7 answers

Then you can try

 preg_replace('/<\/?a[^>]*>/','',$Source);

I tested it online on Rubular.
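
One caveat with this pattern, which a later answer also raises: a[^>]* matches any tag whose name begins with "a", such as <abbr> or <address>. The sketch below compares it with a variant that adds a word boundary after the tag name. The variant and the sample string are illustrative assumptions, not part of the answer.

```php
<?php
// Made-up sample input: one <abbr> element and one real anchor.
$html = '<abbr title="et cetera">etc.</abbr> <a href="#">link</a>';

// Pattern from the answer above: it also strips <abbr ...> and </abbr>.
$loose = preg_replace('/<\/?a[^>]*>/', '', $html);
echo $loose, "\n"; // etc. link

// Assumed variant: \b requires the tag name to end right after "a".
// Note: \b still matches a hyphenated name such as <a-foo>.
$strict = preg_replace('/<\/?a\b[^>]*>/i', '', $html);
echo $strict, "\n"; // <abbr title="et cetera">etc.</abbr> link
```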

+10

stema May 03 '11 at 13:48

You are looking for strip_tags() .

 <?php
 // outputs 'google'
 echo strip_tags('<a href="http://www.google.com/" target="_blank">google</a>');
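
Note that strip_tags() removes every tag, not only anchors. Its optional second argument lists the tags to keep, so it can single out anchors only when every other tag in the input is known in advance. A small sketch with a made-up sample string:

```php
<?php
// Made-up sample input containing a paragraph, bold text and an anchor.
$html = '<p>Search with <a href="http://www.google.com/">google</a> <b>now</b></p>';

// No second argument: every tag is removed.
$plain = strip_tags($html);
echo $plain, "\n"; // Search with google now

// Allow <p> and <b>: only the anchor tags are removed.
$kept = strip_tags($html, '<p><b>');
echo $kept, "\n"; // <p>Search with google <b>now</b></p>
```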

+7

Pekka 웃 May 03 '11 at 13:31

This question has already been answered, but I thought I would add my solution to the mix. I like it better than the accepted answer because it is a bit more ...

 $content = preg_replace(array('"<a href(.*?)>"', '"</a>"'), array('',''), $content);

+4

user1491929 Nov 11 '12 at 5:37

Using a regex:

preg_replace('/<a[^>]+>([^<]+)<\/a>/i','\1',$html);

+3

CSᵠ May 03 '11 at 13:36

Try:

 $str = '<p>paragraph</p><a href="http://www.google.com/" target="_blank" title="<>">google -> foo</a><div>In the div</div>';
 // first, extract the anchor tag
 preg_match("~<a .*?</a>~", $str, $match);
 // then strip the HTML tags
 echo strip_tags($match[0]), "\n";

Output:

 google -> foo

0

Toto May 03 '11 at 15:01

Most of the regexes here did not work for me. Some of them delete the content inside the anchor (which is not what the OP asked for), and not even all of it; others match any tag whose name starts with "a", and so on.

This is what I came up with for my needs at work. We had a problem where HTML passed to wkhtmltopdf with anchor tags (carrying many data attributes and other attributes) would sometimes prevent the PDF from being generated, so I wanted to remove the tags while preserving their text.

Regex:

/<\/?a( [^>]*)?>/ig

In PHP you can do:

 $text = "<a href='http://www.google.com/'>Google1</a><br>"
     . "<a>Google2</a><br>"
     . "<afaketag href='http://www.google.com'>Google2</afaketag><br>"
     . "<afaketag>Google4</afaketag><br>"
     . "<a href='http://www.google.com'><img src='someimage.jpg'></a>";
 echo preg_replace("/<\/?a( [^>]*)?>/i", "", $text);

Outputs:

 Google1<br>Google2<br><afaketag href='http://www.google.com'>Google2</afaketag><br><afaketag>Google4</afaketag><br><img src='someimage.jpg'>

0

Patrick golden Feb 14 '17 at 20:55

Yann milin · Accepted Answer · 2011-05-04T09:26:36+0000

Exactly, this cannot be done correctly with a regular expression.

Here is an example of using the DOM:

 $xml = new DOMDocument();
 $xml->loadHTML($html);
 $links = $xml->getElementsByTagName('a');
 // Loop through the <a> tags and replace each one with its text content
 for ($i = $links->length - 1; $i >= 0; $i--) {
     $linkNode = $links->item($i);
     $lnkText = $linkNode->textContent;
     $newTxtNode = $xml->createTextNode($lnkText);
     $linkNode->parentNode->replaceChild($newTxtNode, $linkNode);
 }

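It is important to iterate in reverse when making changes to the DOM: the node list returned by getElementsByTagName() is live, so replacing a node shifts the indexes of the nodes after it.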
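
For completeness, here is a self-contained sketch of the same approach applied to an HTML fragment. The helper name removeAnchors, the wrapper <div> and the sample input are assumptions added for illustration. The wrapper is there because loadHTML() builds a full document around a fragment. Character encoding handling is left out.

```php
<?php
// Hypothetical helper: replaces every <a> element in an HTML fragment
// with its text content and returns the resulting fragment.
function removeAnchors(string $html): string
{
    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    // Wrap the fragment so it can be found again after parsing.
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Iterate in reverse because the node list is live.
    $links = $doc->getElementsByTagName('a');
    for ($i = $links->length - 1; $i >= 0; $i--) {
        $link = $links->item($i);
        $text = $doc->createTextNode($link->textContent);
        $link->parentNode->replaceChild($text, $link);
    }

    // Serialize only the children of the wrapper <div>, which is the
    // first <div> in document order.
    $wrapper = $doc->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo removeAnchors('Search with <a href="http://www.google.com/" target="_blank">google</a> now'), "\n";
// Search with google now
```

Like the code above, this uses textContent, so markup nested inside a link (for example an <img>) is dropped along with the anchor.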
