Removing URLs Using PHP

I would like to remove only the anchor tags and the actual URLs from a string.
For example, <a href="http://www.example.com">test www.example.com</a> should become test .

Thanks.

3 answers

To complement gd1's answer, these two patterns remove URLs whose host starts with www., with or without an http(s):// prefix:

    // URLs with an http(s):// prefix
    $txt = preg_replace('|https?://www\.[a-z\.0-9]+|i', '', $txt);
    // bare www. hosts
    $txt = preg_replace('|www\.[a-z\.0-9]+|i', '', $txt);
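The patterns above stop at the first character that is not a letter, digit, or dot, and they need a www. host, so paths, query strings, hyphens, and hosts without www. are left behind. A rough sketch of a broader pattern follows. The helper name removeUrls and the pattern itself are assumptions for illustration, not part of the answer. It treats a URL as everything up to the next whitespace, quote, or angle bracket, so trailing punctuation attached to a URL is removed as well.

```php
<?php
// Hypothetical helper; the pattern is an assumption, not taken from the answer.
function removeUrls(string $txt): string
{
    // Match scheme-prefixed URLs or bare "www." hosts, up to the next
    // whitespace, quote, or angle bracket.
    return preg_replace('~(?:https?://|www\.)[^\s<>"\']+~i', '', $txt);
}

echo removeUrls('See http://example.com/a-b?x=1 or www.example.com for details.'), PHP_EOL;
```

Because the match ends at a quote, running this over raw HTML empties the href value but leaves the tag itself in place; the tags still have to be removed separately.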

I often use:

// Remove opening and closing <a> tags, keeping the text between them
$string = preg_replace('/<\/?a(\s[^>]*)?>/i', '', $string);

Also keep in mind that strip_tags removes every tag from a string except those you list as allowed. That is not what you are asking for, but I mention it for completeness.
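As a small illustration of the strip_tags behaviour described above, here is a sketch with a made-up input string. The second argument is the list of allowed tags.

```php
<?php
// Hypothetical sample input, used only for illustration.
$html = '<p>Visit <a href="http://www.example.com">our <b>site</b></a> today.</p>';

// No allowed tags: every tag is removed and only the text remains.
$plain = strip_tags($html);

// Allow <p> and <b>: those tags survive, while <a> is still stripped.
$partial = strip_tags($html, '<p><b>');

echo $plain, PHP_EOL;
echo $partial, PHP_EOL;
```

Note that the link text stays in both cases; strip_tags only removes markup, so any URL written in the visible text is untouched.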

EDIT: I found the original source of this regex and want to give the author credit: http://bavotasan.com/tutorials/using-php-to-remove-an-html-tag-from-a-string/


You should consider using PHP's DOM library for this task.

Regex is not the best tool for parsing HTML.

Here is an example:

    // Create a new DOMDocument to hold the page structure
    $xml = new DOMDocument();
    // Load the HTML contents into the DOM
    $xml->loadHTML($html);
    $links = $xml->getElementsByTagName('a');
    // Loop through the <a> tags in reverse and replace each with its text content
    for ($i = $links->length - 1; $i >= 0; $i--) {
        $linkNode = $links->item($i);
        $lnkText = $linkNode->textContent;
        $newTxtNode = $xml->createTextNode($lnkText);
        $linkNode->parentNode->replaceChild($newTxtNode, $linkNode);
    }

Note:

  • It is important to use a reverse (descending) loop here. The list returned by getElementsByTagName is live, so when replaceChild swaps an <a> node for a text node, that node drops out of the list and the remaining indexes shift. A forward loop would therefore skip some of the links.
  • This code does not remove URLs from the text inside the node. To do that, apply nico's preg_replace to $lnkText just before the createTextNode line. It is generally better to isolate parts of the HTML with the DOM first and then run regular expressions on those text parts.
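
Putting the two notes together, a minimal sketch might look like the following. The function name stripLinksAndUrls, the wrapper <div>, and the combined URL pattern are assumptions made for this example rather than part of the answer. It also assumes the input is an HTML fragment in ASCII; other encodings need extra handling before loadHTML.

```php
<?php
// Hypothetical helper; name, wrapper <div>, and URL pattern are assumptions.
function stripLinksAndUrls(string $html): string
{
    $doc = new DOMDocument();

    // Collect parser warnings from imperfect markup instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();

    $links = $doc->getElementsByTagName('a');

    // Walk backwards: the node list is live and shrinks as links are replaced.
    for ($i = $links->length - 1; $i >= 0; $i--) {
        $link = $links->item($i);
        // Remove URLs from the link text only, before building the text node.
        $text = preg_replace('~(?:https?://|www\.)[^\s<>"\']+~i', '', $link->textContent);
        $link->parentNode->replaceChild($doc->createTextNode($text), $link);
    }

    // Serialize only the children of the wrapper <div>.
    $wrapper = $doc->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo stripLinksAndUrls('<a href="http://www.example.com">test www.example.com</a>'), PHP_EOL;
```

Only the text that was inside a link is passed through the regular expression here; URLs appearing elsewhere in the document are left alone unless the same replacement is applied to the other text nodes.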