I am just getting started with the Simple HTML DOM parser and have run into a problem right at the beginning.
I am following this tutorial:
http://net.tutsplus.com/tutorials/php/html-parsing-and-screen-scraping-with-the-simple-html-dom-library/
For now I just want to find, in the page source, the contents of a div with the class ClearBoth Box.
I fetch the page with curl and create a simple_html_dom object:
$cl = curl_exec($curl);
$html = new simple_html_dom();
$html->load($cl);
Then I wanted to collect the matching divs in an array called $divs:
$divs = $html->find('div[.ClearBoth Box]');
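Put together, the steps above look roughly like the sketch below. The URL is only a placeholder, and I am assuming that simple_html_dom.php from the library sits in the same directory as the script; the selector is the one I am unsure about.

```php
<?php
// Assumption: simple_html_dom.php (Simple HTML DOM library) is in the same directory.
require_once 'simple_html_dom.php';

// Placeholder URL, not the real page.
$url = 'http://example.com/product-page';

// Fetch the page source with curl.
$curl = curl_init($url);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true); // return the body instead of printing it
$cl = curl_exec($curl);

if ($cl === false) {
    exit("curl request failed\n");
}

// Build the DOM object from the fetched source.
$html = new simple_html_dom();
$html->load($cl);

// The selector in question.
$divs = $html->find('div[.ClearBoth Box]');

// Dump the result.
print_r($divs);
```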
But when I print $divs, the output contains far more than that, even though the div itself contains nothing else.
Like this:
Array ( [0] => simple_html_dom_node Object ( [nodetype] => 1 [tag] => br [attr] => Array ( [class] => ClearBoth ) [children] => Array ( ) [nodes] => Array ( ) [parent] => simple_html_dom_node Object ( [nodetype] => 1 [tag] => div [attr] => Array ( [class] => SocialMedia ) [children] => Array ( [0] => simple_html_dom_node Object ( [nodetype] => 1 [tag] => iframe [attr] => Array ( [id] => ShowFacebookButtons [class] => SocialWeb FloatLeft [src] => http:
I do not understand why $divs contains more than just the code from that div.
Here is an example of the source code on the site:
<div class="ClearBoth Box">
  <div>
    <i class="Icon SmallIcon ProductRatingEnabledIconSmall" title="gute peppige Qualität: Sehr empfehlenswert"></i>
    <i class="Icon SmallIcon ProductRatingEnabledIconSmall" title="gute peppige Qualität: Sehr empfehlenswert"></i>
    <i class="Icon SmallIcon ProductRatingEnabledIconSmall" title="gute peppige Qualität: Sehr empfehlenswert"></i>
    <i class="Icon SmallIcon ProductRatingEnabledIconSmall" title="gute peppige Qualität: Sehr empfehlenswert"></i>
    <i class="Icon SmallIcon ProductRatingEnabledIconSmall" title="gute peppige Qualität: Sehr empfehlenswert"></i>
    <strong class="AlignMiddle LeftSmallPadding">gute peppige Qualität</strong>
    <span class="AlignMiddle">(17.03.2013)</span>
  </div>
  <div class="BottomMargin"> gute Verarbeitung, schönes Design, </div>
</div>
What am I doing wrong?