Extract only first-level paragraphs from HTML

Question

I have the following html:

<div id="myID">
  <p>I want this</p>
  <p>and I want this</p>
  <div>
    <p>I don't want this</p>
  </div>
</div>

I want to extract only the first-level <p>...</p> elements.

I tried the simple_html_dom library, for example $html->find('#myID p'), but in the case above it finds all three <p>...</p> elements.

Is there a better way to do this?

+1

Darren sweeney Jun 13 '15 at 8:19

1 answer

PeeHaa · Accepted Answer · 2015-06-13T08:35:17+0000

Instead of using an external library, why not just use PHP's built-in classes to handle the DOM?

First, create an instance of DOMDocument and load your HTML into it:

$dom = new DOMDocument();
$dom->loadHTML($yourHtml);

After that, use DOMXPath to select your items:

$xpath = new DOMXPath($dom);

$nodes = $xpath->query("//*[@id='myID']/p");

var_dump($nodes->length); // outputs 2

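This selects every p element that is a direct child of the element with the id myID. The single slash before p restricts the match to direct children, so the paragraph inside the nested div is not included.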
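For reference, here is a minimal end-to-end sketch that runs the steps above against the HTML from the question. The $texts array and the heredoc holding the markup are only there for illustration; the query itself is the one shown in the answer.

```php
<?php
// The HTML from the question, held in a heredoc for illustration.
$yourHtml = <<<HTML
<div id="myID">
  <p>I want this</p>
  <p>and I want this</p>
  <div>
    <p>I don't want this</p>
  </div>
</div>
HTML;

$dom = new DOMDocument();
$dom->loadHTML($yourHtml);

$xpath = new DOMXPath($dom);

// "/p" matches only direct children of #myID;
// "//p" would match every descendant, including the nested one.
$nodes = $xpath->query("//*[@id='myID']/p");

// Collect the text of each matched paragraph.
$texts = [];
foreach ($nodes as $node) {
    $texts[] = trim($node->textContent);
}

print_r($texts); // "I want this" and "and I want this"
```

Changing the last step of the expression from /p to //p would select all three paragraphs again, which is the behaviour the question was trying to avoid.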