R: Search Google for a string and return the hit count

Is there a way in R to search Google for something and then return the number of results? I have seen many R packages built around Google services (RGoogleDocs, RGoogleData, RGoogleMaps, googleVis), but I cannot find this function anywhere.

2 answers

This is what I use, but it is based on an API that is eventually being phased out. It is also rate limited, to 100 searches per day as far as I can tell. In the function below, service is "web"; you need to get a key from http://code.google.com/apis/loader/signup.html (any URL will work).

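    GetGoogleResults <- function(keyword, service, key) {
      library(RCurl)
      library(rjson)
      base_url <- "http://ajax.googleapis.com/ajax/services/search/"
      # Replace spaces so the keyword can go into the query string
      keyword <- gsub(" ", "+", keyword)
      query <- paste(base_url, service, "?v=1.0&q=", keyword, sep = "")
      # Append the API key only when one is supplied (pass NULL to skip it)
      if (!is.null(key))
        query <- paste(query, "&key=", key, sep = "")
      # Ask for the first page of results
      query <- paste(query, "&start=", 0, sep = "")
      # Fetch the JSON response and parse it into a nested list
      results <- fromJSON(getURL(query))
      return(results)
    }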

Then you can do something like:

 google <- GetGoogleResults("searchTerm", "web", yourkey) 

str(google) will show you the structure of the result. If you just need the number of results, you can use google$responseData$cursor$estimatedResultCount.
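Here is a minimal sketch of pulling out just the count, with a guard for the case where the API returns an error and the field is missing. ExtractHitCount and fake_results are hypothetical names, and the sample value is invented for illustration. I am assuming the count may come back as a character string, so it is converted to a number.

```r
# Pull the estimated hit count out of a parsed response (a nested list).
# Returns NA if the expected fields are missing, e.g. on an API error.
ExtractHitCount <- function(results) {
  count <- results$responseData$cursor$estimatedResultCount
  if (is.null(count)) return(NA_real_)
  as.numeric(count)
}

# Hand-made stand-in for a parsed response; the value is invented.
fake_results <- list(
  responseData = list(cursor = list(estimatedResultCount = "12300"))
)

ExtractHitCount(fake_results)   # count as a number
ExtractHitCount(list())         # missing fields give NA
```

With a real response you would call ExtractHitCount(google) on the object returned by GetGoogleResults.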

As I said, this is based on a protocol that may one day be retired. In response to Dirk's question, there is an alternative approach using a custom search engine that you can use instead, but it is also rate limited (if you want a function for that method, you can ping me at noah_at_noahhl.com).

The last option, which is not rate limited, is simply to use RCurl to fetch the results page from Google, but the page is pretty messy to parse and you have to spoof the user agent to get around Google's efforts to stop people from doing this. (I can also share this code, but it breaks whenever Google changes any of its HTML.)
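For illustration only, a rough sketch of what the scraping route could look like. This is not the code mentioned above. FetchGooglePage and ParseResultCount are hypothetical names, the user agent string is a placeholder, and the "About 1,230,000 results" wording is purely an assumption about the page text, so expect the pattern to need adjusting.

```r
# Fetch the results page while sending a browser-like user agent.
# The search URL and the user agent value are assumptions.
FetchGooglePage <- function(keyword, user_agent = "Mozilla/5.0") {
  url <- paste0("https://www.google.com/search?q=",
                utils::URLencode(keyword, reserved = TRUE))
  RCurl::getURL(url, useragent = user_agent, followlocation = TRUE)
}

# Extract the first number that is directly followed by " results".
# Assumed wording: "About 1,230,000 results"; returns NA when absent.
ParseResultCount <- function(html) {
  m <- regmatches(html, regexpr("[0-9][0-9,.]*(?= results)", html, perl = TRUE))
  if (length(m) == 0) return(NA_real_)
  as.numeric(gsub("[,.]", "", m))
}

# Hand-made snippet standing in for a fetched page.
ParseResultCount("<div>About 1,230,000 results</div>")
```

Keeping the fetch and the parse in separate functions means only the pattern needs changing when the HTML does.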

You might want to start with the Google Custom Search API, and then see how much JSON you need to learn to query it :)

There should be enough R infrastructure around to get something working.
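A minimal sketch of that idea, assuming the Custom Search JSON API endpoint https://www.googleapis.com/customsearch/v1 with its key, cx and q parameters, and assuming the total is reported in searchInformation$totalResults. BuildCustomSearchUrl and GetCustomSearchCount are hypothetical names; you would need your own API key and search engine ID (cx).

```r
# Build the request URL; the keyword is percent-encoded for the query string.
BuildCustomSearchUrl <- function(keyword, key, cx) {
  paste0("https://www.googleapis.com/customsearch/v1",
         "?key=", key,
         "&cx=", cx,
         "&q=", utils::URLencode(keyword, reserved = TRUE))
}

# Fetch and parse the response, then return the reported total as a number.
# Needs the RCurl and rjson packages and network access; returns NA if the
# expected field is missing (for example when the API reports an error).
GetCustomSearchCount <- function(keyword, key, cx) {
  results <- rjson::fromJSON(RCurl::getURL(BuildCustomSearchUrl(keyword, key, cx)))
  total <- results$searchInformation$totalResults
  if (is.null(total)) return(NA_real_)
  as.numeric(total)
}

# The URL builder can be checked without any network access.
BuildCustomSearchUrl("hello world", "KEY", "CX")
```

Usage would be something like GetCustomSearchCount("searchTerm", yourkey, yourcx), where yourcx is the ID of a custom search engine you have set up.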
