Getting the URLs of the first Google search results in a shell script

It is relatively easy to parse AJAX API output using a scripting language:

    #!/usr/bin/env python
    import urllib
    import json

    base = 'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&'
    query = urllib.urlencode({'q' : "something"})
    response = urllib.urlopen(base + query).read()
    data = json.loads(response)
    print data['responseData']['results'][0]['url']

But are there any good ways to do something similar using only basic shell scripting? If you just fetched the API page with curl, how should you encode the URL parameters or parse the JSON?

6 answers

I ended up using curl's --data-urlencode option to encode the query parameter, and plain sed to extract the first result.

    curl -s --get --data-urlencode "q=example" \
        "http://ajax.googleapis.com/ajax/services/search/web?v=1.0" |
        sed 's/"unescapedUrl":"\([^"]*\).*/\1/;s/.*GwebSearch",//'
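
To see what the extraction step does without calling the service, here is a minimal sketch that runs it against a canned response. The sample JSON is an assumption modeled on the field names used above (unescapedUrl, GwebSearch), not real API output, and first_url is a hypothetical helper name. It relies on grep -o, which GNU and BSD grep provide but POSIX does not require.

```shell
#!/bin/sh
# first_url (hypothetical helper): read a JSON response on stdin and print
# the value of the first "unescapedUrl" field.
first_url() {
    # grep -o puts every "unescapedUrl":"..." match on its own line,
    # head keeps the first one, and cut takes the 4th quote-delimited field.
    grep -o '"unescapedUrl":"[^"]*"' | head -n 1 | cut -d '"' -f 4
}

# Assumed response shape, for illustration only.
sample='{"responseData":{"results":[{"GsearchResultClass":"GwebSearch","unescapedUrl":"http://example.com/a","url":"http://example.com/a"},{"GsearchResultClass":"GwebSearch","unescapedUrl":"http://example.org/b","url":"http://example.org/b"}]},"responseStatus":200}'

printf '%s\n' "$sample" | first_url
```

Splitting the matches onto separate lines first makes it easy to pick the Nth result by swapping head for sed -n 'Np'.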


@Lri - here is a script I personally use for my command-line tools and scripts. It uses the lynx command-line utility to dump the URLs. The script can be downloaded from HERE, and the code viewed HERE. Here is the code for your reference:

    #!/bin/bash
    clear
    echo ""
    echo ".=========================================================."
    echo "|                                                         |"
    echo "|               COMMAND LINE GOOGLE SEARCH                |"
    echo "|   ---------------------------------------------------   |"
    echo "|                                                         |"
    echo "|   Version: 1.0                                          |"
    echo "|   Developed by: Rishi Narang                            |"
    echo "|   Blog: www.wtfuzz.com                                  |"
    echo "|                                                         |"
    echo "|   Usage: ./gocmd.sh <search strings>                    |"
    echo "|   Example: ./gocmd.sh example and test                  |"
    echo "|                                                         |"
    echo ".=========================================================."
    echo ""

    if [ -z "$1" ]
    then
        echo "ERROR: No search string supplied."
        echo "USAGE: ./gocmd.sh <search string>"
        echo ""
        echo -n "Anyways for now, supply the search string here: "
        read -r SEARCH
    else
        SEARCH="$*"
    fi

    URL="http://google.com/search?hl=en&safe=off&q="
    # Encode spaces only; other special characters pass through unchanged.
    STRING=$(echo "$SEARCH" | sed 's/ /%20/g')
    # Wrap the query in %22 (double quotes) for an exact-phrase search.
    URI="$URL%22$STRING%22"

    # Dump the rendered page, put each http... token on its own line,
    # and cut everything after the first space.
    lynx -dump "$URI" > gone.tmp
    sed 's/http/\^http/g' gone.tmp | tr -s "^" "\n" | grep http | sed 's/\ .*//g' > gtwo.tmp
    rm gone.tmp
    # Discard Google's own links.
    sed '/google.com/d' gtwo.tmp > urls
    rm gtwo.tmp

    echo "SUCCESS: Extracted $(wc -l urls) and listed them in '$(pwd)/urls' file for reference."
    echo ""
    cat urls
    echo ""
    #EOF
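
The script replaces only spaces with %20, so characters such as & or ? in a search string would reach the URL unencoded. Below is a minimal sketch of a fuller percent-encoder in plain shell. It assumes ASCII input, and the function name urlencode is hypothetical, not part of the original script.

```shell
#!/bin/sh
# urlencode (hypothetical helper): percent-encode every character of $1
# except the unreserved set A-Z a-z 0-9 . _ ~ -
# The body runs in a subshell so the LC_ALL change does not leak out.
urlencode() (
    LC_ALL=C
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}          # the string without its first character
        c=${s%"$rest"}       # the first character itself
        case $c in
            [A-Za-z0-9._~-]) out=$out$c ;;
            # printf with a leading quote yields the character's numeric code
            *) out=$out$(printf '%%%02X' "'$c") ;;
        esac
        s=$rest
    done
    printf '%s\n' "$out"
)

urlencode 'example and test'
```

In the script above, this could stand in for the sed 's/ /%20/g' step, for example STRING=$(urlencode "$SEARCH").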

Untested approach, as I don't have access to a Unix box at the moment...

Assuming "test" is the query string, you can use a simple wget on the following URL: http://www.google.co.in/#hl=en&source=hp&biw=1280&bih=705&q=test&btnI=Google+Search&aq=f&aqi=g10&aql=&oq=test&fp=3cc29334ffc8c2c

This uses Google's "I'm Feeling Lucky" feature and fetches the first URL for you. You may also be able to clean up the above URL.


Lri's answer only returns the last result for me, and I needed the top one, so I changed it to:

    JSON=$(curl -s --get --data-urlencode "q=QUERY STRING HERE" \
        "http://ajax.googleapis.com/ajax/services/search/web?v=1.0" |
        python -mjson.tool)
    response=$(echo "$JSON" | sed -n -e 's/^.*responseStatus\": //p')
    if [ "$response" -eq 200 ]; then
        # Keep only the first unescapedUrl line and strip the surrounding JSON.
        url=$(echo "$JSON" | grep "unescapedUrl" |
            sed -e '1!d' -e "s/^.*unescapedUrl\": \"//" -e "s/\".*$//")
        echo "Success! [$url]"
        wget "$url"
    else
        echo "FAILED! [$response]"
    fi

It's not as compact as I'd like, but it was written in a hurry.


For reference: by November 2013 you will need to completely replace the "ajax.googleapis.com/ajax/services/search/web" calls. Most likely, you will have to switch to a CSE (Custom Search Engine). The problem is that you cannot get "global" results from a CSE. Here is some good advice on how to do this: http://groups.google.com/a/googleproductforums.com/d/msg/customsearch/0aoS-bXgnEM/lwlZ6_IyVDQJ


Many years later, you can now install googler:

    googler -n 1 -c us -l en search something here --json

You can control the number of results returned with the -n flag.

To get only the URL, pipe the output through:

    grep "\"url\"" | tr -s ' ' | cut -d ' ' -f3 | tr -d "\""
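
To show what each stage of that filter does, here is a minimal sketch that applies it to a canned input. The sample is an assumption about the shape of googler's pretty-printed JSON (one key per line, "url" last with no trailing comma), not captured output, and urls_only is a hypothetical wrapper name.

```shell
#!/bin/sh
# urls_only (hypothetical wrapper): print the value of every "url" line
# found in pretty-printed JSON read from stdin.
urls_only() {
    # grep keeps the "url" lines, tr -s collapses the indentation to one
    # space, cut takes the third space-separated field (the quoted value),
    # and tr -d removes the quotes.
    grep '"url"' | tr -s ' ' | cut -d ' ' -f 3 | tr -d '"'
}

# Assumed output shape, for illustration only.
sample='[
  {
    "abstract": "An illustrative abstract.",
    "title": "Example Domain",
    "url": "https://example.com/"
  }
]'

printf '%s\n' "$sample" | urls_only
```

Because the filter splits on spaces, it depends on the value containing none, which holds for URLs but not for the title or abstract fields.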
