Perl Mechanize find all links in Div

Is there a way to find all links in a specific div using Mechanize?

I tried find_all_links, but could not find a way to restrict it to a single div. For example:

    <div class="sometag">
      <ul class="tags">
        <li><a href="/a.html">A</a></li>
        <li><a href="/b.html">B</a></li>
      </ul>
    </div>
2 answers

A handy tool for extracting information from HTML is HTML::Grabber. It uses jQuery-style selectors to reference elements in the document, so you can do something like this:

    use HTML::Grabber;

    # Your mechanize stuff here ...

    my $dom = HTML::Grabber->new( html => $mech->content );

    my @links;
    $dom->find('div.sometag a')->each(sub {
        push @links, $_->attr('href');
    });

Web::Scraper is useful for scraping.

    use strict;
    use warnings;
    use URI;
    use WWW::Mechanize;
    use Web::Scraper;

    my $mech = WWW::Mechanize->new;
    $mech->env_proxy;

    # If you want to login, do it with mechanize.

    # The <ul> carries the "tags" class in the question's markup, not the <li>.
    my $staff = scrape {
        process 'div.sometag ul.tags a', 'links[]' => '@href';
    };

    # Pass mechanize to the scraper as its user agent.
    $staff->user_agent($mech);

    my $res = $staff->scrape( URI->new("http://example.com/") );

    for my $link (@{ $res->{links} }) {
        warn $link;
    }

Sorry, I have not tested this code.
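
For comparison, here is a minimal sketch of the same extraction using HTML::TreeBuilder and its look_down method, assuming that module is installed (it usually is wherever WWW::Mechanize is). The helper name links_in_div is hypothetical and not part of any library, and the sample markup is the snippet from the question plus one link outside the div. In a real script you would pass $mech->content instead of the literal string.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper (not part of WWW::Mechanize): returns the href of every
# <a> nested inside a <div> whose class attribute is exactly $class.
sub links_in_div {
    my ($html, $class) = @_;

    my $tree = HTML::TreeBuilder->new_from_content($html);
    my @links;

    for my $div ($tree->look_down(_tag => 'div', class => $class)) {
        # href => qr/./ skips anchors that have no href attribute.
        push @links, map { $_->attr('href') }
                     $div->look_down(_tag => 'a', href => qr/./);
    }

    $tree->delete;    # free the parse tree
    return @links;
}

# Sample markup from the question; assumption: real code uses $mech->content.
my $html = <<'HTML';
<div class="sometag">
  <ul class="tags">
    <li><a href="/a.html">A</a></li>
    <li><a href="/b.html">B</a></li>
  </ul>
</div>
<a href="/outside.html">not in the div</a>
HTML

print "$_\n" for links_in_div($html, 'sometag');
```

Note that look_down compares the class attribute as a whole string, so a div written as class="sometag other" would not match. A regex such as qr/\bsometag\b/ in place of $class is one way to loosen that.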
