Using VBA in Excel to search Google in IE and return a hyperlink to the first result

I am trying to use IE automation from Excel to search Google for text strings. For each string, I want to return a hyperlink to the site of the first result in another cell in Excel. Is this possible? I have a list of 60,000 entries to search on Google, and I need the website hyperlink of the first result for each one. Is there another approach you would recommend? Thanks in advance for the help.

+8
vba excel msxml
2 answers

For 60,000 entries, I recommend using the xmlHTTP object instead of using IE.
HTTP requests are simpler and much faster.

    Sub XMLHTTP()
        Dim url As String, lastRow As Long
        Dim XMLHTTP As Object, html As Object, objResultDiv As Object, objH3 As Object, link As Object
        Dim start_time As Date
        Dim end_time As Date
        Dim i As Long, str_text As String

        ' Last used row in column A (search terms start in row 2)
        lastRow = Range("A" & Rows.Count).End(xlUp).Row

        Dim cookie As String
        Dim result_cookie As String

        start_time = Time
        Debug.Print "start_time:" & start_time

        For i = 2 To lastRow
            ' The random "rnd" parameter makes each request URL unique
            url = "https://www.google.co.in/search?q=" & Cells(i, 1) & "&rnd=" & WorksheetFunction.RandBetween(1, 10000)

            Set XMLHTTP = CreateObject("MSXML2.serverXMLHTTP")
            XMLHTTP.Open "GET", url, False
            XMLHTTP.setRequestHeader "Content-Type", "text/xml"
            XMLHTTP.setRequestHeader "User-Agent", "Mozilla/5.0 (Windows NT 6.1; rv:25.0) Gecko/20100101 Firefox/25.0"
            XMLHTTP.send

            ' Parse the response and take the first link inside the results container
            Set html = CreateObject("htmlfile")
            html.body.innerHTML = XMLHTTP.ResponseText
            Set objResultDiv = html.getelementbyid("rso")
            Set objH3 = objResultDiv.getelementsbytagname("H3")(0)
            Set link = objH3.getelementsbytagname("a")(0)

            ' Strip the <EM> highlighting from the link text
            str_text = Replace(link.innerHTML, "<EM>", "")
            str_text = Replace(str_text, "</EM>", "")

            Cells(i, 2) = str_text
            Cells(i, 3) = link.href
            DoEvents
        Next

        end_time = Time
        Debug.Print "end_time:" & end_time

        ' DateDiff with "n" reports the elapsed time in minutes
        Debug.Print "done" & "Time taken : " & DateDiff("n", start_time, end_time)
        MsgBox "done" & "Time taken : " & DateDiff("n", start_time, end_time)
    End Sub
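One thing to watch in the loop above: the cell value is concatenated straight into the query string, so a search term containing spaces, "&" or "#" can produce a broken or truncated query. A minimal sketch of one way around this, assuming Excel 2013 or later on Windows (where WorksheetFunction.EncodeURL is available); BuildSearchUrl is a hypothetical helper name, not part of the original answer:

```vba
' Hypothetical helper: builds the search URL with the term percent-encoded.
' Assumes Excel 2013+ on Windows, where WorksheetFunction.EncodeURL exists.
Function BuildSearchUrl(ByVal searchTerm As String) As String
    BuildSearchUrl = "https://www.google.co.in/search?q=" & _
                     Application.WorksheetFunction.EncodeURL(searchTerm)
End Function
```

Inside the loop this could be used as url = BuildSearchUrl(CStr(Cells(i, 1).Value)), with the "&rnd=" suffix appended afterwards if you want to keep it.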

HTH
Santos

+18

The links appear to be inside H3 tags. Typically, to wait until the page has finished loading, you would need something like the following:

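    ' 64-bit Office (VBA7) requires PtrSafe on API declarations
    #If VBA7 Then
        Private Declare PtrSafe Sub Sleep Lib "kernel32" (ByVal nMilliseconds As Long)
    #Else
        Private Declare Sub Sleep Lib "kernel32" (ByVal nMilliseconds As Long)
    #End If

    ' Not defined automatically when IE is created late-bound via CreateObject
    Private Const READYSTATE_COMPLETE As Long = 4

    Sub UseIE()
        Dim ie As Object
        Dim thePage As Object
        Dim strTextOfPage As String

        Set ie = CreateObject("InternetExplorer.Application")
        'ie.FullScreen = True
        With ie
            '.Visible = True
            .Navigate "http://www.bbc.co.uk"
            While Not .ReadyState = READYSTATE_COMPLETE '4
                Sleep 500 'wait 1/2 sec before trying again
            Wend
        End With
        Set thePage = ie.Document
        'more code here
    End Sub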

However, I would instead try to reference the A element in the first H3: call getElementsByTagName("H3"), take the first of these elements, and then look inside it for the A link and its href attribute.

In JavaScript, attempts to reference non-existent elements return undefined, but in VBA this will probably need error-handling code.
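As a rough sketch of what that handling might look like, the function below checks each step before using it and returns an empty string instead of raising an error when no H3 or A element is found. FirstResultHref is a hypothetical name, and doc is assumed to be an already loaded HTML document object (ie.Document or an "htmlfile" object):

```vba
' Hypothetical helper: returns the href of the A link inside the first H3,
' or "" when the document, the H3 or the link is missing.
Function FirstResultHref(ByVal doc As Object) As String
    Dim h3s As Object, anchors As Object

    FirstResultHref = ""
    If doc Is Nothing Then Exit Function

    ' An empty collection has Length 0, so test before indexing into it
    Set h3s = doc.getElementsByTagName("H3")
    If h3s.Length = 0 Then Exit Function

    Set anchors = h3s(0).getElementsByTagName("A")
    If anchors.Length = 0 Then Exit Function

    FirstResultHref = anchors(0).href
End Function
```

With this shape the caller can write the result to the sheet and simply leave the cell blank for searches that return nothing, rather than wrapping the whole loop in On Error Resume Next.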

As soon as I had the href, I would stop the navigation (I am not sure of the command for this, perhaps ie.Stop) or go straight to the next page.

However, the first link(s) will often be sponsored links, and the href returned for them is slightly garbled. The text of these sponsored links includes em tags. I could use this information to discard these links and look further down the page.

I do not know if there is a better way to do this.

0