Getting URLs for first google search results in shell script

Question


It is relatively easy to parse AJAX API output using a scripting language:

    #!/usr/bin/env python
    import urllib
    import json

    base = 'http://ajax.googleapis.com/ajax/services/search/web?v=1.0&'
    query = urllib.urlencode({'q' : "something"})
    response = urllib.urlopen(base + query).read()
    data = json.loads(response)
    print data['responseData']['results'][0]['url']

But are there any better ways to do something similar with just basic shell scripting? If you just fetch the API page with curl, how should you encode the URL parameters or parse the JSON?

+7

bash

user495470 Mar 31 '11 at 21:27


6 answers

@Lri - here is a script I personally use for my own command-line needs. It uses the lynx command-line utility to dump the URLs. The script can be downloaded from HERE, and the code viewed HERE. Here is the code for your reference:

    #!/bin/bash
    clear
    echo ""
    echo ".=========================================================."
    echo "|                                                         |"
    echo "|               COMMAND LINE GOOGLE SEARCH                |"
    echo "|   ---------------------------------------------------   |"
    echo "|                                                         |"
    echo "|   Version: 1.0                                          |"
    echo "|   Developed by: Rishi Narang                            |"
    echo "|   Blog: www.wtfuzz.com                                  |"
    echo "|                                                         |"
    echo "|   Usage: ./gocmd.sh <search strings>                    |"
    echo "|   Example: ./gocmd.sh example and test                  |"
    echo "|                                                         |"
    echo ".=========================================================."
    echo ""

    # Take the search string from the arguments, or prompt for it.
    if [ -z "$1" ]
    then
        echo "ERROR: No search string supplied."
        echo "USAGE: ./gocmd.sh <search string>"
        echo ""
        echo -n "Anyways for now, supply the search string here: "
        read SEARCH
    else
        SEARCH="$*"
    fi

    # Build the search URL: spaces become %20, and the query is wrapped in %22 (double quotes).
    URL="http://google.com/search?hl=en&safe=off&q="
    STRING=`echo $SEARCH | sed 's/ /%20/g'`
    URI="$URL%22$STRING%22"

    # Dump the results page, put each URL on its own line, and drop Google's own links.
    lynx -dump "$URI" > gone.tmp
    sed 's/http/\^http/g' gone.tmp | tr -s "^" "\n" | grep http | sed 's/\ .*//g' > gtwo.tmp
    rm gone.tmp
    sed '/google.com/d' gtwo.tmp > urls
    rm gtwo.tmp

    echo "SUCCESS: Extracted `wc -l urls` and listed them in '`pwd`/urls' file for reference."
    echo ""
    cat urls
    echo ""
    #EOF
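The script above only turns spaces into %20, so characters such as & or = in a search string would still break the query. A minimal sketch of fuller percent-encoding in plain shell follows. The name urlencode is a hypothetical helper, not part of the script above, and the sketch assumes ASCII input (non-ASCII text may need LC_ALL=C or a different tool).

```shell
#!/bin/sh
# urlencode (hypothetical helper): percent-encode everything except the
# RFC 3986 unreserved characters. Assumes ASCII input.
# The body runs in a subshell "( ... )" so its variables stay local.
urlencode() (
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}         # everything after the first character
        c=${s%"$rest"}      # the first character itself
        case "$c" in
            [a-zA-Z0-9.~_-]) out=$out$c ;;
            # "'$c" makes printf use the numeric value of the character
            *) out=$out$(printf '%%%02X' "'$c") ;;
        esac
        s=$rest
    done
    printf '%s\n' "$out"
)

urlencode "example and test"
```

The result could then replace the sed substitution when building the query part of the URL.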

+4

rn May 04 '11 at 18:46


Untested approach, as I don't have access to a Unix box at the moment...

Assuming “test” is the query string, you can use a simple wget on the following URL:

    http://www.google.co.in/#hl=en&source=hp&biw=1280&bih=705&q=test&btnI=Google+Search&aq=f&aqi=g10&aql=&oq=test&fp=3cc29334ffc8c2c

This uses Google's “I'm Feeling Lucky” functionality and takes you straight to the first result. You may also be able to clean up the above URL.

+1

qwerty Apr 1 '11 at 18:43


Lri's answer only returns the last result for me, and I need the top, so I changed it to:

    JSON=$(curl -s --get --data-urlencode "q=QUERY STRING HERE" "http://ajax.googleapis.com/ajax/services/search/web?v=1.0" | python -mjson.tool)
    response=$(echo "$JSON" | sed -n -e 's/^.*responseStatus\": //p')
    if [ "$response" -eq 200 ] ; then
        # Keep only the first unescapedUrl line and strip everything around the value.
        url=$(echo "$JSON" | egrep "unescapedUrl" | sed -e '1!d' -e "s/^.*unescapedUrl\": \"//" -e "s/\".*$//")
        echo "Success! [$url]"
        wget "$url"
    else
        echo "FAILED! [$response]"
    fi

It is not as compact as I would like, but I wrote it in a hurry.

+1

katbyte Oct 12 '11 at 6:30


Just for reference: by November 2013 you will need to replace the "ajax.googleapis.com/ajax/services/search/web" calls completely. Most likely, you will have to replace them with CSE (Custom Search Engine). The problem is that you cannot get “global” results from a CSE. Here is a good tip on how to do this: http://groups.google.com/a/googleproductforums.com/d/msg/customsearch/0aoS-bXgnEM/lwlZ6_IyVDQJ

0

neverlastn Apr 11 '12 at 11:26


Many years later, you can simply install googler:

googler -n 1 -c us -l en search something here --json

You can control the number of results returned with the -n flag.

To get only the URL, pipe the output through:

 grep "\"url\""|tr -s ' ' |cut -d ' ' -f3|tr -d "\""

0

once Oct 26 '17 at 10:19


user495470 · Accepted Answer · 2011-09-27T18:20:20+0000

I ended up using curl's --data-urlencode option to encode the query parameter, and just sed to extract the first result.

curl -s --get --data-urlencode "q=example" http://ajax.googleapis.com/ajax/services/search/web?v=1.0 | sed 's/"unescapedUrl":"\([^"]*\).*/\1/;s/.*GwebSearch",//'
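The extraction step can be tried offline. The sketch below is not the exact command above: it swaps in grep -o (widely available, though not POSIX) with head and sed, and runs against a made-up sample response. The function name first_url and the sample JSON are assumptions for illustration, shaped only after the field names mentioned in this thread.

```shell
#!/bin/sh
# first_url (hypothetical helper): read compact JSON on stdin and print
# the value of the first "unescapedUrl" field, if there is one.
first_url() {
    grep -o '"unescapedUrl":"[^"]*"' |
        head -n 1 |
        sed 's/^"unescapedUrl":"//; s/"$//'
}

# Made-up sample response; a real response may be laid out differently.
sample='{"responseData":{"results":[{"GsearchResultClass":"GwebSearch","unescapedUrl":"http://example.com/a","url":"http://example.com/a"},{"GsearchResultClass":"GwebSearch","unescapedUrl":"http://example.org/b","url":"http://example.org/b"}]},"responseStatus":200}'

printf '%s\n' "$sample" | first_url
```

Matching the field by name rather than by its position after "GwebSearch" makes the pipeline a little less sensitive to the order of keys, but it is still plain text matching and not a JSON parser.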
