Using: Delphi 2010 with the latest version of Indy.
I am trying to scrape data from Google's AdSense web page to get reports. However, so far I have not been successful. It stops after the first request and does not continue.
Using Fiddler to inspect the traffic to the Google AdSense website while a web browser loads an AdSense page, I can see that the browser's request goes through several redirects before the page loads.
However, my Delphi application only generates a few requests before it stops.
Here are the steps I followed:
- Drop an IdHTTP component and an IdSSLIOHandlerSocketOpenSSL component (IdSSLIOHandlerSocketOpenSSL1) on the form.
- Set the IdHTTP component's AllowCookies and HandleRedirects properties to True, and its IOHandler property to IdSSLIOHandlerSocketOpenSSL1.
- On the IdSSLIOHandlerSocketOpenSSL1 component, set the Method property (under SSLOptions) to sslvSSLv23.
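For reference, the design-time settings above should correspond to roughly the following run-time code. This is only a minimal sketch: the unit name HttpSetup and the function CreateConfiguredHTTP are made up for illustration, and only the Indy classes and properties are real. It also assumes the OpenSSL DLLs can be found by the application at run time.

```delphi
unit HttpSetup; // hypothetical unit name, for illustration only

interface

uses
  Classes, IdHTTP, IdSSLOpenSSL;

// Hypothetical helper: returns an HTTP client configured like the
// components on the form described in the steps above.
function CreateConfiguredHTTP(AOwner: TComponent): TIdHTTP;

implementation

function CreateConfiguredHTTP(AOwner: TComponent): TIdHTTP;
var
  SSL: TIdSSLIOHandlerSocketOpenSSL;
begin
  Result := TIdHTTP.Create(AOwner);
  try
    // The SSL handler is owned by the client, so it is freed with it.
    SSL := TIdSSLIOHandlerSocketOpenSSL.Create(Result);
    SSL.SSLOptions.Method := sslvSSLv23;

    Result.IOHandler := SSL;
    Result.AllowCookies := True;     // keep cookies between requests
    Result.HandleRedirects := True;  // follow 3xx responses automatically
  except
    Result.Free;
    raise;
  end;
end;

end.
```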
Finally, I have this code:
    procedure TfmMain.GetUrlToFile(AURL, AFile: String);
    var
      Output: TMemoryStream;
    begin
      Output := TMemoryStream.Create;
      try
        // Download the page at AURL into memory, then save it to AFile.
        IdHTTP1.Get(AURL, Output);
        Output.SaveToFile(AFile);
      finally
        Output.Free;
      end;
    end;

However, it does not reach the login page as expected. I expect it to behave like a web browser and follow the redirects until it reaches the final page.
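To see how far the component actually gets, each redirect could be logged through the OnRedirect event of IdHTTP1. The following is a rough sketch, not tested against AdSense: the unit RedirectLog and the class TRedirectLogger are hypothetical names, while the event signature and the RedirectMaximum property are, as far as I know, the ones Indy 10 declares in IdHTTP.

```delphi
unit RedirectLog; // hypothetical unit name, for illustration only

interface

uses
  Classes, SysUtils, IdHTTP;

type
  // Hypothetical helper class that records every redirect it is told about.
  TRedirectLogger = class
  private
    FLog: TStrings; // not owned; supplied by the caller (e.g. Memo1.Lines)
  public
    constructor Create(ALog: TStrings);
    // Matches the TIdHTTP.OnRedirect event signature (assumed from Indy 10).
    procedure HandleRedirect(Sender: TObject; var dest: string;
      var NumRedirect: Integer; var Handled: Boolean;
      var VMethod: TIdHTTPMethod);
  end;

implementation

constructor TRedirectLogger.Create(ALog: TStrings);
begin
  inherited Create;
  FLog := ALog;
end;

procedure TRedirectLogger.HandleRedirect(Sender: TObject; var dest: string;
  var NumRedirect: Integer; var Handled: Boolean;
  var VMethod: TIdHTTPMethod);
begin
  // Record the hop number and the target taken from the Location header.
  FLog.Add(Format('Redirect %d -> %s', [NumRedirect, dest]));
  // Handled is deliberately left as passed in, so the HandleRedirects
  // setting of the component still decides whether the redirect is followed.
end;

end.
```

It would be hooked up before calling Get, for example with IdHTTP1.OnRedirect := Logger.HandleRedirect, where Logger is a TRedirectLogger instance. If the log ends before the login page, raising IdHTTP1.RedirectMaximum is one thing to try.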
This is the output of the headers from Fiddler:
HTTP/1.1 302 Found
Location: https://encrypted.google.com/
Cache-control: private
Content-Type: text/html; charset=utf-8
Set-Cookie: PREF=ID=5166063f01b64b03:FF=0:TM=1293571783:LM=1293571783:S=a5OtsOqxu_GiV3d6; expires=Thu, 27-Dec-2012 21:29:43 GMT; path=/; domain=.google.com
Set-Cookie: NID=42=XFUwZdkyF0TJKmoJjqoGgYNtGyOz-Irvz7ivao2z0 - pCBKPpAvCGUeaa5GXLneP41wlpse-yU5UuC57pBfMkv434t7XB1H68ET0VAVAVAVAZAVAVZVAVZV0VAVZVAVZVAVAVZVAVZVAVZVAVZVAVZVAV5V5V5VVZVAVZVVVZVAVZVVVZVVVZVVVZVVVZVVVZV one gr aJeVA; expires=Wed, 29-Jun-2011 21:29:43 GMT; path=/; domain=.google.com; HttpOnly
Date: Tue, 28 Dec 2010 21:29:43 GMT
Server: gws
Content-Length: 226
X-XSS-Protection: 1; mode=block
Firstly, is there something wrong with this output?
Is there anything else I have to do to get the IdHTTP component to continue redirecting to the last page?
delphi web-scraping indy
Stevel