PHP: parse an HTML page

<div>divbox</div> <p>para1</p> <p>para2</p> <p>para3</p> <table class="table"><tr><td></td></tr></table> <p>para4</p> <p>para5</p> 

Can someone tell me how I can parse this HTML page to display ONLY para1, para2 and para3, and delete everything else?

Condition:
I want to get all the content from the first <p> up to the first <table class="table">.

(the first table will always have the class "table")

Desired output:

 <p>para1</p> <p>para2</p> <p>para3</p> 
1 answer
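    $d = new DOMDocument();
    libxml_use_internal_errors(true); // suppress warnings from malformed HTML
    $d->loadHTML($file);              // $file holds the HTML source as a string

    // Walk all elements in document order and stop at the first <table>.
    foreach ($d->getElementsByTagName("*") as $el) {
        if ($el->tagName == "p") {
            echo $el->textContent, "\n";
        } elseif ($el->tagName == "table") {
            break;
        }
    }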

This outputs:

 para1
 para2
 para3
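
Note that the snippet above prints only the text of each paragraph, while the question asked for the <p> markup itself. A minimal sketch of one way to keep the tags, assuming the same DOMDocument approach: serialize each matching node with saveHTML() instead of reading textContent. The function name paragraphsBeforeFirstTable and the $html variable are hypothetical names chosen for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper (not from the original answer): returns the markup of
// every <p> that appears before the first <table>, in document order.
function paragraphsBeforeFirstTable(string $html): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);      // silence warnings from imperfect HTML
    $doc->loadHTML($html);
    libxml_clear_errors();

    $out = [];
    foreach ($doc->getElementsByTagName('*') as $el) {
        if ($el->tagName === 'table') {
            break;                         // stop at the first table
        }
        if ($el->tagName === 'p') {
            $out[] = $doc->saveHTML($el);  // serialize the node including its tags
        }
    }
    return implode(' ', $out);
}

// Sample input taken from the question.
$html = '<div>divbox</div> <p>para1</p> <p>para2</p> <p>para3</p> '
      . '<table class="table"><tr><td></td></tr></table> <p>para4</p> <p>para5</p>';

echo paragraphsBeforeFirstTable($html), "\n";
```

Like the original loop, this sketch stops at the first <table> of any class; since the question states that the first table always has the class "table", no extra class check is made here.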