How to extract data using Goutte Crawler?

The code at the end of this question returns the hrefs to the content pages. Now I want to follow those hrefs, extract the content from each page, and output it. The divs I need to extract look like this:

    <div class="c_pad">
        <div class="c_label">
            <span class="std_header2">Contact:</span>
        </div>
        <div class="c_name">
            <span class="std_text_b">Monkey</span>
        </div>
        <div class="clear"></div>
    </div>

    <div class="c_pad">
        <div class="c_label">
            <span class="std_header2">Phone number:</span>
        </div>
        <div class="c_phone">
            <span class="std_text_b">001111111</span>
        </div>
        <div class="clear"></div>
    </div>

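    use Goutte\Client;

    for ($i = 0; $i <= 1; $i++) {
        // Fetch one page of search results.
        $p = new Client();
        $d = $p->request('GET', $link . '&std=1&results=' . $i);

        $d->filter('a[class="o_title"]')->each(function ($node) {
            // Follow each result link to its detail page.
            $pp = new Client();
            $dd = $pp->request('GET', $node->attr('href'));

            // $node must be imported with use(), otherwise it is undefined in the inner closure.
            $dd->filter('div[id="adv_desc"]')->each(function ($tekst) use ($node) {
                echo $node->attr('href') . '<br>' . $tekst->text();
            });
        });
    }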
1 answer

You want to filter specific tags with attributes.

But you use $d->filter('a[class="o_title"]'), which selects a tags with the attribute class="o_title". Those elements are not part of the content you want.

You just need to adjust the filter selector so that it matches the correct elements, for example div.c_name span.std_text_b for the contact name.

filter() accepts CSS selectors, which are similar to jQuery selector syntax: https://api.jquery.com/category/selectors/

Link to the Symfony DomCrawler documentation used by Goutte: http://symfony.com/doc/current/components/dom_crawler.html#node-filtering
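
As a rough illustration of the advice above, here is a minimal sketch that runs the selectors against the markup from the question. It assumes symfony/dom-crawler and symfony/css-selector are installed via Composer (Goutte depends on both) and that the autoloader lives at vendor/autoload.php. The helper firstText() is a hypothetical name introduced only for this example. In the real script, the Crawler would come from the Goutte request for each detail page rather than from an inline string.

```php
<?php
// Assumption: Composer autoloader at this path, with symfony/dom-crawler
// and symfony/css-selector available.
require __DIR__ . '/vendor/autoload.php';

use Symfony\Component\DomCrawler\Crawler;

/**
 * Hypothetical helper: text of the first element matching $selector,
 * or null when nothing matches (avoids an exception on an empty node list).
 */
function firstText(Crawler $crawler, string $selector): ?string
{
    $nodes = $crawler->filter($selector);

    return $nodes->count() > 0 ? trim($nodes->first()->text()) : null;
}

// Sample markup copied from the question.
$html = <<<'HTML'
<html><body>
<div class="c_pad">
    <div class="c_label"><span class="std_header2">Contact:</span></div>
    <div class="c_name"><span class="std_text_b">Monkey</span></div>
    <div class="clear"></div>
</div>
<div class="c_pad">
    <div class="c_label"><span class="std_header2">Phone number:</span></div>
    <div class="c_phone"><span class="std_text_b">001111111</span></div>
    <div class="clear"></div>
</div>
</body></html>
HTML;

$crawler = new Crawler($html);

// Select the value spans by the class of their wrapping div.
$contact = firstText($crawler, 'div.c_name span.std_text_b');
$phone   = firstText($crawler, 'div.c_phone span.std_text_b');

echo $contact . ' / ' . $phone . PHP_EOL; // prints "Monkey / 001111111"
```

Inside the loop from the question, the same two selectors could be applied to $dd in place of div[id="adv_desc"], assuming the detail pages contain this markup.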
