What do I gain by filtering URLs through the Perl URI module?

Do I gain anything by transforming my $url like this: $url = URI->new( $url );?

#!/usr/bin/env perl
use warnings; use strict;
use 5.012;
use URI;
use XML::LibXML;

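my $url = 'http://stackoverflow.com/';
# Wrap the plain string in a URI object.
$url = URI->new( $url );

# Fetch and parse the page; recover => 2 tolerates broken HTML without warnings.
my $doc = XML::LibXML->load_html( location => $url, recover => 2 );
# Count the <a> elements on the page.
my @nodes = $doc->getElementsByTagName( 'a' );
say scalar @nodes;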
+5
3 answers

The constructor of the URI module cleans up the URI for you, for example by correctly escaping characters that are not valid in a URI (see URI::Escape).
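As a minimal sketch of that cleanup, the script below passes a deliberately messy string to the constructor. The example.com URL and the sample strings are made up for illustration; URI->new, as_string and uri_escape are the module's documented calls.

```perl
#!/usr/bin/env perl
use warnings; use strict;
use 5.012;
use URI;
use URI::Escape qw( uri_escape );

# Hypothetical input: whitespace around the URL and a space inside the path.
my $raw = '  http://example.com/some path/index.html  ';
my $uri = URI->new( $raw );

# The constructor trims the surrounding whitespace and percent-encodes
# the space, which is not a legal URI character.
my $clean = $uri->as_string;
say $clean;      # http://example.com/some%20path/index.html

# URI::Escape does the same job by hand for a single component.
my $escaped = uri_escape( 'a b&c' );
say $escaped;    # a%20b%26c
```

Note that the constructor only escapes characters that cannot appear in a URI at all; reserved characters such as & or ? are left alone, so individual query values still need uri_escape.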

+4

The URI module has several advantages:

  • It normalizes the URL for you.
  • It can resolve relative URLs.
  • It can detect invalid URLs (although you first need to turn off its more permissive handling).
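The points in this list can be sketched roughly as follows. The sample URLs and the helper looks_like_web_url are hypothetical, and the validity check is only one assumed reading of "detecting invalid URLs", not a full validator; canonical, new_abs, scheme and host are documented URI methods.

```perl
#!/usr/bin/env perl
use warnings; use strict;
use 5.012;
use URI;

# Normalization: canonical() lowercases the scheme and host and drops
# the default port.
my $canon = URI->new( 'HTTP://StackOverflow.COM:80/questions' )->canonical;
say $canon;    # http://stackoverflow.com/questions

# Relative URLs: new_abs() resolves a link against the page it came from.
my $abs = URI->new_abs( '/questions/tagged/perl', 'http://stackoverflow.com/tags' );
say $abs;      # http://stackoverflow.com/questions/tagged/perl

# A crude validity check (an assumption for this sketch):
# accept only absolute http/https URLs that carry a host.
sub looks_like_web_url {
    my ( $str ) = @_;
    my $u      = URI->new( $str );
    my $scheme = $u->scheme // '';
    return $scheme =~ /\Ahttps?\z/ && length( $u->host // '' ) ? 1 : 0;
}

say looks_like_web_url( 'http://stackoverflow.com/' );    # 1
say looks_like_web_url( 'not a url' );                    # 0
```

In a crawler, a check like this would typically run on each link after it has been resolved with new_abs, so that relative links are not rejected for lacking a scheme.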


+3

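After $url = URI->new( $url ); the variable $url is no longer a plain string but a URI object, which still stringifies to the URI. When it is passed to XML::LibXML, it is used as the URL string.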
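To see what the assignment actually produces, a short sketch like the following prints the object's class, a few of its accessors, and its string form. It reuses the URL from the question; which accessors to show is a choice made for this illustration.

```perl
#!/usr/bin/env perl
use warnings; use strict;
use 5.012;
use URI;

my $url = 'http://stackoverflow.com/';
$url = URI->new( $url );

# $url is now an object of a scheme-specific subclass of URI.
my $class = ref $url;
say $class;          # URI::http

# The parts of the URL are available through accessor methods.
say $url->scheme;    # http
say $url->host;      # stackoverflow.com
say $url->path;      # /

# In string context the object turns back into the URL text, which is
# why it can be handed to code that expects a plain string.
my $string = "$url";
say $string;         # http://stackoverflow.com/
```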

+1
