Extracting URLs from a website

Just wondering if anyone can help me with the following. I want to extract the URLs from this website: http://www.directorycritic.com/free-directory-list.html?pg=1&sort=pr

I have the following code:

<?PHP  
$url = "http://www.directorycritic.com/free-directory-list.html?pg=1&sort=pr";
$input = @file_get_contents($url) or die("Could not access file: $url"); 
$regexp = "<a\s[^>]*href=(\"??)([^\" >]*?)\\1[^>]*>(.*)<\/a>"; 
if(preg_match_all("/$regexp/siU", $input, $matches)) { 
// $matches[2] = array of link addresses 
// $matches[3] = array of link text - including HTML code
} 
?>

Currently this does nothing with the matches. What I need is to grab all the URLs in the table across all 16 pages. I would really appreciate some help on how to change the code above to do this and write the URLs to a text file.

+5
2 answers

Use an HTML DOM parser. The file_get_html() function below comes from the PHP Simple HTML DOM Parser library:

$html = file_get_html('http://www.example.com/');

// Find all links
$links = array(); 
foreach($html->find('a') as $element) 
       $links[] = $element->href;

$links now holds the href of every link found on the page.

Regular expressions are not a reliable way to parse HTML.

EDIT:

For other HTML parsers, see Gordon's post.

+5

Do not use regular expressions to parse HTML.

Use an HTML parser instead, such as the one in PHP's DOM extension:

$code = file_get_contents($url);
$doc = new DOMDocument();
$doc->loadHTML($code);
$links = array();
foreach ($doc->getElementsByTagName('a') as $element) {
    if ($element->hasAttribute('href')) {
        $links[] = $element->getAttribute('href');
    }
}
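
To cover the rest of the question (all 16 pages, output to a text file), here is a minimal sketch built on the same DOM approach. The helper names extractLinks() and saveLinks() and the output file name are made up for illustration. It also assumes the pages differ only in the pg parameter. It collects every link on each page; narrowing it down to the table would require knowing the table's markup, which is not shown here.

```php
<?php
// Hypothetical helper: returns the href of every <a> element in an HTML string.
function extractLinks(string $html): array
{
    $doc = new DOMDocument();
    // Real-world pages are rarely valid HTML, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $links = array();
    foreach ($doc->getElementsByTagName('a') as $element) {
        if ($element->hasAttribute('href')) {
            $links[] = $element->getAttribute('href');
        }
    }
    return $links;
}

// Hypothetical helper: fetches pages 1..$pages and appends their links to $outFile.
function saveLinks(string $baseUrl, int $pages, string $outFile): void
{
    file_put_contents($outFile, ''); // start with an empty file
    for ($pg = 1; $pg <= $pages; $pg++) {
        $html = @file_get_contents($baseUrl . '?pg=' . $pg . '&sort=pr');
        if ($html === false || $html === '') {
            continue; // skip pages that could not be fetched
        }
        $links = extractLinks($html);
        if ($links) {
            // One URL per line, appended to what earlier pages wrote.
            file_put_contents($outFile, implode("\n", $links) . "\n", FILE_APPEND);
        }
    }
}
```

It could then be called as saveLinks('http://www.directorycritic.com/free-directory-list.html', 16, 'links.txt');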

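Note that getAttribute('href') returns URI references exactly as written in the page, which may be relative rather than absolute URIs. They need to be resolved against the page URL before use.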

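As far as I know, PHP does not ship a suitable function for this. See RFC 3986 for the resolution algorithm, and related questions about resolving URLs with the HTML DOM.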
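
For what it is worth, a simplified sketch of the reference resolution described in RFC 3986 (section 5.2) could look like this. The function names resolveUrl() and removeDotSegments() are made up for illustration. The sketch assumes the base URL is an absolute http or https URL, and it ignores user info and some corner cases of the RFC, so treat it as a starting point rather than a complete implementation.

```php
<?php
// Hypothetical helper: removes "." and ".." segments from a path that starts with "/".
function removeDotSegments(string $path): string
{
    $segments = explode('/', $path);
    $out = array();
    foreach ($segments as $segment) {
        if ($segment === '..') {
            if (count($out) > 1) {
                array_pop($out); // step back one directory, but never above the root
            }
        } elseif ($segment !== '.') {
            $out[] = $segment;
        }
    }
    // A trailing "." or ".." refers to a directory, so keep the trailing slash.
    $last = end($segments);
    if ($last === '.' || $last === '..') {
        $out[] = '';
    }
    return implode('/', $out);
}

// Hypothetical helper: resolves the reference $ref against the absolute URL $base.
function resolveUrl(string $base, string $ref): string
{
    // A reference with a scheme ("http:", "mailto:", ...) is already absolute.
    if (preg_match('~^[a-z][a-z0-9+.-]*:~i', $ref)) {
        return $ref;
    }
    $b = parse_url($base);
    $scheme = $b['scheme']; // assumption: $base has a scheme and a host
    $authority = $b['host'] . (isset($b['port']) ? ':' . $b['port'] : '');
    $basePath = isset($b['path']) ? $b['path'] : '/';

    // Network-path reference ("//host/path"): only the scheme is inherited.
    if (strpos($ref, '//') === 0) {
        return $scheme . ':' . $ref;
    }

    // Split the reference into its path and its query/fragment suffix.
    $n = strcspn($ref, '?#');
    $path = substr($ref, 0, $n);
    $suffix = substr($ref, $n);

    if ($path === '') {
        // Empty path: keep the base path, and the base query unless a new one is given.
        $newPath = $basePath;
        if ($suffix === '' || $suffix[0] === '#') {
            $suffix = (isset($b['query']) ? '?' . $b['query'] : '') . $suffix;
        }
    } elseif ($path[0] === '/') {
        $newPath = removeDotSegments($path);
    } else {
        // Relative path: merge with the directory part of the base path.
        $dir = substr($basePath, 0, strrpos($basePath, '/') + 1);
        $newPath = removeDotSegments($dir . $path);
    }
    return $scheme . '://' . $authority . $newPath . $suffix;
}
```

Each collected href could then be passed through resolveUrl($url, $href) before being stored, where $url is the address of the page it came from.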

+3
