Can anyone recommend a good best-practices tutorial for a web scraping project?

I need to do a fairly extensive web scraping project, and I am considering Hpricot or Beautiful Soup (i.e. Ruby or Python). Has anyone come across a tutorial they thought was particularly good, one that would help me start the project off on the right foot?

+5
6 answers

Railscasts has a great episode on scrAPI.

+2

For Python, look at Scrapy and Mechanize.

+9

Take a look at the book Webbots, Spiders, and Screen Scrapers.

Its examples are written in PHP.

+5

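Consider lxml as well as BeautifulSoup. Both parse HTML. For badly broken "soup" HTML, lxml can hand the parsing to BeautifulSoup (through lxml's soupparser module), so you can work with the BeautifulSoup API or the lxml API, whichever you prefer.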

Ian Bicking also recommends lxml.

BeautifulSoup is still the better fit on Google App Engine, or anywhere else that allows only pure Python.
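
As a rough illustration of the kind of tolerant parsing that lxml and BeautifulSoup are used for, here is a minimal sketch that pulls links out of malformed HTML. It deliberately uses only Python's standard-library html.parser instead of either library, so it runs without installing anything. The LinkExtractor class, the extract_links helper, and the SAMPLE markup are all made up for this example.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (href, link text) pairs from possibly malformed HTML."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished (href, text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # An unclosed <a> is treated as ending where the next one starts.
            self._close_link()
            # Anchors without an href are ignored (href stays None).
            self._href = dict(attrs).get("href")

    def handle_endtag(self, tag):
        if tag == "a":
            self._close_link()

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def _close_link(self):
        if self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
        self._href = None
        self._text = []

    def close(self):
        super().close()
        # Flush a link that was still open at the end of the input.
        self._close_link()


def extract_links(html):
    """Return a list of (href, text) tuples found in the given HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical sample with unclosed <li>, <a>, and <b> tags.
SAMPLE = '<ul><li><a href="/a">First<li><a href="/b"><b>Second</a></ul>'

if __name__ == "__main__":
    print(extract_links(SAMPLE))
```

For a real project, lxml or BeautifulSoup would replace the hand-written class and give you a proper document tree to query; the sketch only shows that a forgiving parser can still recover useful data when tags are left unclosed.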

+4

For Ruby, there is scRUBYt, a web scraping toolkit.

0
