How to get a website's RSS feed URL using PHP?

I need to find the RSS feed URL of a website programmatically.

(Using either PHP or jQuery.)

4 answers

This is more involved than just pasting some code here, but I can point you in the right direction:

  • First, fetch the page.
  • Parse the string you get back, looking for the RSS auto-discovery <link> tag. You could load the whole document as XML and use DOM traversal, but I would just use a regular expression.
  • Extract the href attribute of that tag, and you have the URL of the RSS feed.
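
The DOM traversal route mentioned in the second step can be sketched roughly as follows. This is a minimal sketch, not code from the thread: the function name `findFeedLinks` is made up for illustration, it assumes the `dom` extension is enabled, and it assumes the page has already been fetched into a string (for example with `file_get_contents`).

```php
<?php
// Hypothetical helper (not from the thread): returns the href of every
// feed auto-discovery <link> element found in an HTML string.
function findFeedLinks(string $html): array
{
    if (trim($html) === '') {
        return [];
    }

    // Assumption: only these two feed types are of interest.
    $feedTypes = ['application/rss+xml', 'application/atom+xml'];

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $feeds = [];
    foreach ($doc->getElementsByTagName('link') as $link) {
        // rel is a space-separated list of tokens, e.g. "alternate"
        $rel  = preg_split('/\s+/', strtolower(trim($link->getAttribute('rel'))));
        $type = strtolower(trim($link->getAttribute('type')));
        $href = trim($link->getAttribute('href'));

        if (in_array('alternate', $rel, true)
            && in_array($type, $feedTypes, true)
            && $href !== '') {
            $feeds[] = $href;
        }
    }

    return $feeds;
}
```

Calling `findFeedLinks($html)` returns the href values in document order, exactly as they are written in the page, so a relative href still has to be resolved against the page URL. Unlike a regex, this does not depend on the order of the attributes inside the tag.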

The general process has already been described (Quentin, DOOManiac), so here is some code (Demo):

    <?php
    $location = 'http://hakre.wordpress.com/';
    $html = file_get_contents($location);
    echo getRSSLocation($html, $location); # http://hakre.wordpress.com/feed/

    /**
     * @link http://keithdevens.com/weblog/archive/2002/Jun/03/RSSAuto-DiscoveryPHP
     */
    function getRSSLocation($html, $location){
        if(!$html or !$location){
            return false;
        }
        # search through the HTML, save all <link> tags
        # and store each link's attributes in an associative array
        preg_match_all('/<link\s+(.*?)\s*\/?>/si', $html, $matches);
        $links = $matches[1];
        $final_links = array();
        $link_count = count($links);
        for($n=0; $n<$link_count; $n++){
            $final_link = array(); # reset, so attributes do not leak from the previous tag
            $attributes = preg_split('/\s+/s', $links[$n]);
            foreach($attributes as $attribute){
                $att = preg_split('/\s*=\s*/s', $attribute, 2);
                if(isset($att[1])){
                    $att[1] = preg_replace('/([\'"]?)(.*)\1/', '$2', $att[1]);
                    $final_link[strtolower($att[0])] = $att[1];
                }
            }
            $final_links[$n] = $final_link;
        }
        # now figure out which one points to the RSS file
        for($n=0; $n<$link_count; $n++){
            $rel  = isset($final_links[$n]['rel'])  ? strtolower($final_links[$n]['rel'])  : '';
            $type = isset($final_links[$n]['type']) ? strtolower($final_links[$n]['type']) : '';
            $href = isset($final_links[$n]['href']) ? $final_links[$n]['href'] : '';
            if($rel != 'alternate' or !$href){
                continue;
            }
            # text/xml is a kludge to make the first version of this still work
            if($type != 'application/rss+xml' and $type != 'text/xml'){
                continue;
            }
            if(preg_match('#^https?://#i', $href)){
                # if it's absolute
                return $href;
            }
            # otherwise, 'absolutize' it, reusing the scheme of the page
            $url_parts = parse_url($location);
            $scheme = isset($url_parts['scheme']) ? $url_parts['scheme'] : 'http';
            $full_url = "$scheme://$url_parts[host]";
            if(isset($url_parts['port'])){
                $full_url .= ":$url_parts[port]";
            }
            if($href[0] != '/'){
                # it's a relative link on the domain: prepend the directory of the page
                $path = isset($url_parts['path']) ? $url_parts['path'] : '/';
                $full_url .= substr($path, 0, strrpos($path, '/') + 1);
            }
            return $full_url . $href;
        }
        return false;
    }

See: RSS auto-discovery with PHP (archived copy).


The rules for making an RSS feed discoverable are well documented. You just need to parse the HTML and look for the elements they describe.


A slightly smaller function that grabs the first available feed, be it RSS or Atom (most blogs offer both; this returns whichever is listed first).

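    # meant to live inside a class; drop "public" to use it as a plain function
    public function getFeedUrl($url){
        # fetch the page once; @ suppresses the warning when the request fails
        $html = @file_get_contents($url);
        if($html === false){
            return false;
        }
        # expects the attributes in the order rel, type, title, href
        if(preg_match('/<link\srel="alternate"\stype="application\/(?:rss|atom)\+xml"\stitle="[^"]*"[^>]*?\shref="([^"]*)"[^>]*>/i', $html, $match)){
            return $match[1];
        }
        return false;
    }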
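
The function above returns the href exactly as it appears in the page, which may be a relative URL. A rough sketch of how such a value could be made absolute is shown below. The name `absolutizeFeedUrl` is hypothetical, `$pageUrl` is assumed to be a valid absolute URL, and `./` or `../` segments are deliberately not collapsed, so treat it as a starting point and not as a complete URL resolver.

```php
<?php
// Hypothetical helper (not from the thread): turns the href found in a
// <link> tag into an absolute URL, using the page URL as the base.
function absolutizeFeedUrl(string $href, string $pageUrl): string
{
    // Already absolute (http:// or https://): return unchanged.
    if (preg_match('#^https?://#i', $href)) {
        return $href;
    }

    $base   = parse_url($pageUrl);
    $scheme = $base['scheme'] ?? 'http';
    $host   = $base['host'] ?? '';

    // Protocol-relative href such as //example.com/feed
    if (strpos($href, '//') === 0) {
        return $scheme . ':' . $href;
    }

    $origin = $scheme . '://' . $host;
    if (isset($base['port'])) {
        $origin .= ':' . $base['port'];
    }

    // Root-relative href such as /feed/
    if (strpos($href, '/') === 0) {
        return $origin . $href;
    }

    // Document-relative href: resolve against the directory of the page path.
    $path = $base['path'] ?? '/';
    $pos  = strrpos($path, '/');
    $dir  = ($pos === false) ? '/' : substr($path, 0, $pos + 1);

    return $origin . $dir . $href;
}
```

For example, `absolutizeFeedUrl('feed.xml', 'http://example.org/blog/index.html')` resolves against the `/blog/` directory of the page, while an href that already starts with `http://` or `https://` passes through untouched.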
