I want to create a web crawler in Groovy (using the Grails framework and the MongoDB database) that can crawl a site and build a list of the site's URLs along with their resource types, their contents, the response time, and the number of redirects for each.
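To make the intended data model concrete, here is a minimal sketch of the record I want to store per crawled URL, assuming the GORM MongoDB plugin; the class and field names (CrawledPage, responseTimeMs, etc.) are just my own illustration, not an established schema:

    // Rough sketch of the data captured per crawled URL (names are illustrative),
    // stored in MongoDB via the Grails GORM MongoDB plugin.
    class CrawledPage {
        String url
        String resourceType    // e.g. "text/html", "image/png"
        String content         // response body
        Long responseTimeMs    // time taken to fetch the URL
        Integer redirectCount  // number of redirects followed

        static mapWith = "mongo"  // map this domain class to MongoDB
    }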
I am trying to decide between JSoup and Crawler4j. I have read about what each of them mainly does, but I cannot clearly understand the difference between them. Can anyone suggest which one is better suited for the features above? Or is it simply wrong to compare these two?
Thanks.
web-crawler jsoup crawler4j
clever_bassi