Extract only first-level paragraphs from HTML
I have the following html:
<div id="myID">
<p>I want this</p>
<p>and I want this</p>
<div>
<p>I don't want this</p>
</div>
</div>
I want to extract only the first-level <p>...</p> elements.
I tried the simple_html_dom library, for example $html->find('#myID p'), but in the case above it finds all three <p>...</p> elements.
Is there a better way to do this?
Instead of using an external library, why not use PHP's built-in DOM classes?
First, create a DOMDocument instance and load your HTML into it:
$dom = new DOMDocument();
$dom->loadHTML($yourHtml);
After that, use DOMXPath to select your items:
$xpath = new DOMXPath($dom);
$nodes = $xpath->query("//*[@id='myID']/p");
var_dump($nodes->length); // outputs 2
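The query works because a single "/" in XPath steps to direct children only, whereas "//" matches descendants at any depth. Here is a minimal self-contained sketch of that difference, assuming the markup from the question is held in a string; the variable names $html, $direct and $all are illustrative and not part of the original answer.

```php
<?php
// Markup from the question, held in a string for illustration.
$html = <<<HTML
<div id="myID">
<p>I want this</p>
<p>and I want this</p>
<div>
<p>I don't want this</p>
</div>
</div>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);

// "/p" selects only <p> elements that are direct children of #myID.
$direct = $xpath->query("//*[@id='myID']/p");

// "//p" selects <p> elements at any depth below #myID, for comparison.
$all = $xpath->query("//*[@id='myID']//p");

// Print the text of each first-level paragraph.
foreach ($direct as $p) {
    echo trim($p->textContent), PHP_EOL;
}

echo $direct->length, ' direct, ', $all->length, ' total', PHP_EOL;
```

The descendant form behaves like the '#myID p' selector from the question, which is why that selector matched all three paragraphs.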