You can use HTTParty if you just need to fetch the data.
Sample code (from the gem's example):
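    require 'httparty'
    require 'pp'

    # Minimal client class; including HTTParty adds class-level helpers such as get.
    class Google
      include HTTParty
      format :html  # treat responses as HTML, so the body stays a plain string
    end

    # Fetch over plain HTTP and pretty-print the response.
    pp Google.get('http://google.com')

    puts '', '*' * 70, ''

    # Fetch over HTTPS.
    pp Google.get('https://www.google.com')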
Nokogiri really excels at parsing and querying the fetched HTML. Here is sample code from a Railscast:
    require 'nokogiri'
    require 'open-uri'

    url = "http://www.walmart.com/search/search-ng.do?search_constraint=0&ic=48_0&search_query=batman&Find.x=0&Find.y=0&Find=Find"
    # URI.open comes from open-uri; a bare open(url) no longer fetches URLs on Ruby 3.0+.
    doc = Nokogiri::HTML(URI.open(url))
    puts doc.at_css("title").text
    doc.css(".item").each do |item|
      title = item.at_css(".prodLink").text
      # Keep only the first dollar amount found in the price text.
      price = item.at_css(".PriceCompare .BodyS, .PriceXLBold").text[/\$[0-9\.]+/]
      puts "#{title} - #{price}"
      puts item.at_css(".prodLink")[:href]
    end
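The `text[/\$[0-9\.]+/]` part of the price line is plain Ruby rather than Nokogiri: indexing a string with a regular expression returns the first match, or `nil` when nothing matches. Here is a small sketch of that behavior in isolation, using only the standard library. The sample strings and the `extract_price` helper are made up for illustration and are not part of the Railscast code.

```ruby
# Same pattern as in the scraper: a dollar sign followed by digits and dots.
PRICE_PATTERN = /\$[0-9\.]+/

# Hypothetical helper: returns the first price-like substring, or nil if none.
def extract_price(text)
  text[PRICE_PATTERN]
end

# Made-up strings standing in for the text of a price element.
samples = [
  "Price: $19.99 each",
  "From $5 to $12.50",
  "Out of stock"
]

samples.each do |text|
  puts "#{text.inspect} => #{extract_price(text).inspect}"
end
```

Only the first match is returned, so a range like "From $5 to $12.50" yields "$5", and text without a dollar amount yields `nil`. The scraper will print an empty price in that case.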