IdHTTP: how to handle page redirects?

Using: Delphi 2010, latest version of Indy

I am trying to scrape data from Google's AdSense web page in order to get reports. However, so far I have not been successful. It stops after the first request and does not continue.

Using Fiddler to debug traffic / requests on the Google Adsense website and a web browser to load an Adsense page, I see that the request (from a web browser) generates several redirects before the page loads.

However, my Delphi application generates only a couple of requests before it stops.

Here are the steps I followed:

  • Drop an IdHTTP and an IdSSLIOHandlerSocketOpenSSL1 component on the form.
  • Set the IdHTTP component's AllowCookies and HandleRedirects properties to True, and its IOHandler property to IdSSLIOHandlerSocketOpenSSL1.
  • Set the IdSSLIOHandlerSocketOpenSSL1 component's Method property to sslvSSLv23.
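
For reference, the design-time steps above can be sketched in code roughly as follows. This is only a minimal sketch, not the asker's actual project: the program name and the FetchToFile procedure are made up for illustration, and it assumes the Indy units are on the search path and the OpenSSL libraries can be loaded at run time.

```pascal
program FetchDemo;

{$APPTYPE CONSOLE}

uses
  Classes, IdHTTP, IdSSLOpenSSL;

// Hypothetical helper: configures a TIdHTTP the same way as the
// design-time steps and saves the final response body to a file.
procedure FetchToFile(const AURL, AFile: string);
var
  HTTP: TIdHTTP;
  SSL: TIdSSLIOHandlerSocketOpenSSL;
  Output: TMemoryStream;
begin
  HTTP := TIdHTTP.Create(nil);
  try
    // The IOHandler is owned by HTTP, so it is freed together with it.
    SSL := TIdSSLIOHandlerSocketOpenSSL.Create(HTTP);
    SSL.SSLOptions.Method := sslvSSLv23;
    HTTP.IOHandler := SSL;
    HTTP.AllowCookies := True;
    HTTP.HandleRedirects := True;

    Output := TMemoryStream.Create;
    try
      HTTP.Get(AURL, Output);
      Output.SaveToFile(AFile);
    finally
      Output.Free;
    end;
  finally
    HTTP.Free;
  end;
end;

begin
  FetchToFile('https://www.google.com/adsense/', 'a.html');
end.
```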

Finally, I have this code:

    procedure TfmMain.GetUrlToFile(AURL, AFile: String);
    var
      Output: TMemoryStream;
    begin
      Output := TMemoryStream.Create;
      try
        IdHTTP1.Get(AURL, Output);
        Output.SaveToFile(AFile);
      finally
        Output.Free;
      end;
    end;

However, it does not get to the login page as expected. I expected it to behave like a web browser and follow the redirects until it reaches the final page.

These are the response headers as captured by Fiddler:

    HTTP/1.1 302 Found
    Location: https://encrypted.google.com/
    Cache-control: private
    Content-Type: text/html; charset=utf-8
    Set-Cookie: PREF=ID=5166063f01b64b03:FF=0:TM=1293571783:LM=1293571783:S=a5OtsOqxu_GiV3d6; expires=Thu, 27-Dec-2012 21:29:43 GMT; path=/; domain=.google.com
    Set-Cookie: NID=42=XFUwZdkyF0TJKmoJjqoGgYNtGyOz-Irvz7ivao2z0 - pCBKPpAvCGUeaa5GXLneP41wlpse-yU5UuC57pBfMkv434t7XB1H68ET0VAVAVAVAZAVAVZVAVZV0VAVZVAVZVAVAVZVAVZVAVZVAVZVAVZVAV5V5V5VVZVAVZVVVZVAVZVVVZVVVZVVVZVVVZVVVZV one gr aJeVA; expires=Wed, 29-Jun-2011 21:29:43 GMT; path=/; domain=.google.com; HttpOnly
    Date: Tue, 28 Dec 2010 21:29:43 GMT
    Server: gws
    Content-Length: 226
    X-XSS-Protection: 1; mode=block

First, is there anything wrong with this output?

Is there anything else I have to do to get the IdHTTP component to continue redirecting to the last page?

+6
delphi web-scraping indy
3 answers

The values of the IdHTTP component's properties before making the call:

    Name := 'IdHTTP1';
    IOHandler := IdSSLIOHandlerSocketOpenSSL1;
    AllowCookies := True;
    HandleRedirects := True;
    RedirectMaximum := 35;
    Request.UserAgent := 'Mozilla/5.0 (Windows NT 5.1; rv:2.0b8) Gecko/20100101 Firefox/4.0b8';
    HTTPOptions := [hoForceEncodeParams];
    OnRedirect := IdHTTP1Redirect;
    CookieManager := IdCookieManager1;

The OnRedirect event handler:

    procedure TfmMain.IdHTTP1Redirect(Sender: TObject; var dest: string;
      var NumRedirect: Integer; var Handled: Boolean; var VMethod: string);
    begin
      // Tell Indy to go ahead and follow the redirect.
      Handled := True;
    end;

Making the call:

    FURL := 'https://www.google.com';
    GetUrlToFile(FURL + '/adsense/', 'a.html');

    procedure TfmMain.GetUrlToFile(AURL, AFile: String);
    var
      Output: TMemoryStream;
    begin
      Output := TMemoryStream.Create;
      try
        try
          IdHTTP1.Get(AURL, Output);
          IdHTTP1.Disconnect;
        except
          // Errors are ignored here; whatever was received is still saved.
        end;
        Output.SaveToFile(AFile);
      finally
        Output.Free;
      end;
    end;

Here are the request and response headers, as captured by Fiddler:

[image: Fiddler capture of the request and response headers]

+7

Getting redirects to work

Set TIdHTTP.HandleRedirects := True so that it follows redirects automatically.

Use TIdHTTP.RedirectMaximum to set how many consecutive redirects it will follow.

Alternatively, you can assign TIdHTTP.OnRedirect and set Handled := True in that handler. This is what I am doing in a project that is supposed to read data from the WikiMedia website (my own website).

About the HTTP response

There is nothing wrong with this response; it is a very simple redirect to https://encrypted.google.com/. TIdHTTP should go on to request the page given in the response. It also sets some cookies.

Other suggestions

Remember to assign a CookieManager and make sure that you use the same CookieManager for all subsequent requests. If you do not, you will probably be redirected to the login page again and again.
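
A minimal sketch of what sharing one CookieManager could look like when more than one TIdHTTP instance is involved. The NewClient helper, the program name, and the second URL are invented for illustration; the only point is that both clients are given the same TIdCookieManager, so cookies set while following redirects on the first request are available to the second.

```pascal
program SharedCookiesDemo;

{$APPTYPE CONSOLE}

uses
  IdHTTP, IdSSLOpenSSL, IdCookieManager;

// Hypothetical helper: creates a client that stores its cookies
// in the cookie manager passed in by the caller.
function NewClient(ACookies: TIdCookieManager): TIdHTTP;
begin
  Result := TIdHTTP.Create(nil);
  // The IOHandler is owned by the client and freed with it.
  Result.IOHandler := TIdSSLIOHandlerSocketOpenSSL.Create(Result);
  Result.AllowCookies := True;
  Result.HandleRedirects := True;
  Result.CookieManager := ACookies;
end;

var
  Cookies: TIdCookieManager;
  First, Second: TIdHTTP;
begin
  Cookies := TIdCookieManager.Create(nil);
  try
    First := NewClient(Cookies);
    Second := NewClient(Cookies);
    try
      // Cookies received here end up in the shared manager ...
      First.Get('https://www.google.com/adsense/');
      // ... and are sent along with this later request
      // (placeholder URL, assumed for the example).
      Second.Get('https://www.google.com/');
    finally
      Second.Free;
      First.Free;
    end;
  finally
    // Free the manager last, after the clients that use it.
    Cookies.Free;
  end;
end.
```

With a single TIdHTTP component on a form, as in the question, the same effect comes from assigning its CookieManager property once and reusing that component for every request.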

+1

In my case, I needed to fix dest, because somehow it had a ';' in it!

    procedure Tfrm1.IdHTTP1Redirect(Sender: TObject; var dest: string;
      var NumRedirect: Integer; var Handled: Boolean; var VMethod: string);
    var
      i: Integer;
    begin
      // Cut dest off at the first ';' so only the URL itself remains.
      i := Pos(';', dest);
      if i > 0 then
      begin
        dest := Copy(dest, 1, i - 1);
      end;
      Handled := True;
    end;
0