Perl Mechanize find all links in Div

Question

Is there a way to find all links in a specific div using Mechanize?

I tried to use find_all_links but could not find a way to restrict it to a single div. For example:

    <div class="sometag">
      <ul class="tags">
        <li><a href="/a.html">A</a></li>
        <li><a href="/b.html">B</a></li>
      </ul>
    </div>

+4

html perl mechanize

REALFREE Jun 22 '11 at 23:22

2 answers

Web::Scraper is useful for scraping.

    use strict;
    use warnings;
    use URI;
    use WWW::Mechanize;
    use Web::Scraper;

    my $mech = WWW::Mechanize->new;
    $mech->env_proxy;
    # If you want to log in, do it with Mechanize here.

    # Collect the href of every link inside ul.tags within div.sometag.
    my $staff = scrape {
        process 'div.sometag ul.tags a', 'links[]' => '@href';
    };

    # Pass Mechanize to the scraper as its user agent.
    $staff->user_agent($mech);

    my $res = $staff->scrape( URI->new("http://example.com/") );
    for my $link (@{ $res->{links} || [] }) {
        warn $link;
    }

Sorry, I have not tested this code.

+1

mattn Jun 23 '11 at 0:13

Grant McLean · Accepted Answer · 2011-06-23T04:20:20+0000

A handy tool for extracting information from HTML is HTML::Grabber. It uses jQuery-style selector syntax to reference elements in the HTML, so you can do something like this:

    use HTML::Grabber;

    # Your mechanize stuff here ...

    my $dom = HTML::Grabber->new( html => $mech->content );

    my @links;
    $dom->find('div.sometag a')->each(sub {
        push @links, $_->attr('href');
    });
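If installing another selector library is not an option, a similar result can probably be had with HTML::TreeBuilder, which is typically already present on systems that have WWW::Mechanize. The following is only a minimal sketch and is not taken from either answer: the helper name links_in_div and the inline sample HTML are hypothetical, and in a real script the HTML would come from $mech->content after a successful $mech->get.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: return the href of every <a> found inside
# each <div> whose class attribute contains the given class name.
sub links_in_div {
    my ($html, $class) = @_;

    my $tree = HTML::TreeBuilder->new_from_content($html);
    my @links;

    # Match the class as a whole word so "sometag" does not match "sometags".
    my $class_re = qr/(?:^|\s)\Q$class\E(?:\s|$)/;

    for my $div ($tree->look_down(_tag => 'div', class => $class_re)) {
        # Only anchors that actually carry a non-empty href.
        push @links,
            map { $_->attr('href') }
            $div->look_down(_tag => 'a', href => qr/./);
    }

    $tree->delete;    # release the parse tree
    return @links;
}

# Sample input standing in for $mech->content (assumption for illustration).
my $html = <<'HTML';
<div class="sometag">
  <ul class="tags">
    <li><a href="/a.html">A</a></li>
    <li><a href="/b.html">B</a></li>
  </ul>
</div>
<div class="other"><a href="/c.html">C</a></div>
HTML

print "$_\n" for links_in_div($html, 'sometag');
```

The hrefs come back exactly as written in the page, so relative links such as /a.html stay relative. If absolute URLs are needed, each one could be resolved with URI->new_abs($href, $mech->base). Note that nested divs sharing the same class would yield duplicate entries with this sketch.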
