The problem with regex in php
<div class="begin">...</div>
How can I match the HTML inside (and including) <div class="begin"> in PHP?
I need a regular expression solution that can handle the nested case.
0
user198729
5 answers
Use DOM and DOMXPath instead of regular expressions; you will thank me for this:
    // something useful:
    function dumpDomNode($node) {
        $temp = new DOMDocument();
        // a node from another document must be imported before it can be appended
        $temp->appendChild($temp->importNode($node, true));
        return $temp->saveHTML();
    }

    $dom = new DOMDocument();
    $dom->loadHTML($html_string);
    $xpath = new DOMXPath($dom);
    // select every <div> whose class attribute is exactly "begin"
    $elements = $xpath->query("//div[@class='begin']");
    foreach ($elements as $el) {
        echo dumpDomNode($el); // <-- or do something more useful with it
    }

Trying to do this with regular expressions will lead you down the path to madness ...
+11
slebetman
This, this is very good.
In short, do not use regular expressions to parse HTML. Instead, look at the DOM classes and especially DOMDocument::loadHTML.
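As a rough sketch of the DOM route (not necessarily how the answerer would write it), the snippet below loads a hypothetical sample string with DOMDocument::loadHTML and serializes only the matching element. The sample markup and the variable names are assumptions made for illustration; passing a node to saveHTML() requires PHP 5.3.6 or later.

```php
<?php
// Hypothetical sample input containing a nested <div>.
$html = '<html><body><p>intro</p>'
      . '<div class="begin">outer <div>inner</div> tail</div>'
      . '</body></html>';

// Collect libxml parse warnings internally instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($dom);

$result = '';
foreach ($xpath->query("//div[@class='begin']") as $div) {
    // saveHTML() with a node argument serializes just that subtree,
    // so the nested <div> comes along with its parent.
    $result .= $dom->saveHTML($div);
}

echo $result, "\n";
```

Because the parser builds a tree, the nested div is kept inside its parent without any special handling.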
+2
Emil Vikström
Here is your regex:
    preg_match('/<div class=\"begin\">.*<\/div>/simU', $string, $matches);

But:
- Regex does not know what XML/HTML elements are. To a regex, HTML is just a string. That is why the other answers are right: regex is not for DOM parsing. It is for finding patterns in strings.
- I provided Regex because you are not going to parse the entire HTML page, but simply extract one specific piece of text from it, in which case Regex works great.
- If there is a nested DIV in the DIV, Regex will not work properly. If so, do not use Regex. Use one of the other solutions, because then you need a DOM parsing, not a string match.
- To search for strings with more or less clearly defined start and end values, use regular string functions instead, because they are often faster.
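To illustrate the nested-DIV caveat, here is a small sketch that runs the pattern above against a hypothetical input string (the sample markup is an assumption, not from the question). The U modifier makes .* lazy, so the match ends at the first closing tag it meets.

```php
<?php
// Hypothetical sample input: the target <div> contains another <div>.
$string = '<p>intro</p>'
        . '<div class="begin">outer <div>inner</div> tail</div>'
        . '<p>end</p>';

// Same pattern as in the answer; U (ungreedy) stops at the FIRST </div>.
preg_match('/<div class=\"begin\">.*<\/div>/simU', $string, $matches);

echo $matches[0], "\n";
// prints: <div class="begin">outer <div>inner</div>
```

The match is cut off after the inner closing tag, so " tail" and the real closing tag are lost. Dropping the U modifier does not help in general either, since a greedy .* would run on to the last </div> in the whole string.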
+2
Gordon
    // file_get_html() comes from the PHP Simple HTML DOM Parser library (simple_html_dom.php)
    // Create DOM from URL
    $html = file_get_html('http://example.org/');
    echo $html->find('div.begin', 0)->outertext;

+1
karim79
Here is one way using string methods:
    $str = <<<A
    blah
    <div class="begin">
    blah blah
    blah
    blah blah
    </div>
    blah
    A;
    // split on the closing tag, then look for the opening tag in each piece
    $s = explode("</div>", $str);
    foreach ($s as $k => $v) {
        $m = strpos($v, '<div class="begin">');
        if ($m !== FALSE) {
            echo substr($v, $m);
        }
    }

Output:

    $ php test.php
    <div class="begin">
    blah blah
    blah
    blah blah

0
ghostdog74