How do I get mechanize to work with the forms on this page?

Question

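    import mechanize

    url = 'http://steamcommunity.com'

    # RobustFactory is mechanize's more lenient parser factory
    br = mechanize.Browser(factory=mechanize.RobustFactory())
    br.open(url)
    print(br.request)
    print(br.form)

    # Iterating over the forms is what raises ParseError
    for each in br.forms():
        print(each)
        print('')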

The above code results in:

    Traceback (most recent call last):
      File "./mech_test.py", line 12, in <module>
        for each in br.forms():
      File "build/bdist.linux-i686/egg/mechanize/_mechanize.py", line 426, in forms
      File "build/bdist.linux-i686/egg/mechanize/_html.py", line 559, in forms
      File "build/bdist.linux-i686/egg/mechanize/_html.py", line 228, in forms
    mechanize._html.ParseError

My specific goal is to use the login form, but I can't even get mechanize to recognize that the page has any forms. Even what I believe is the most basic way of selecting a form, br.select_form(nr=0), produces the same traceback. The form's enctype is multipart/form-data, if that matters.

I think it all comes down to a two-part question: how can I get mechanize to work with this page, or, if that is not possible, what other approach would let me keep cookies across requests?

edit: As pointed out below, the URL redirects to 'https://steamcommunity.com'.

Mechanize can successfully retrieve the HTML, as the following code shows:

    import mechanize

    url = 'https://steamcommunity.com'

    # HTTPS handler, since the site redirects to https
    hh = mechanize.HTTPSHandler()
    hh.set_http_debuglevel(1)  # log the HTTP exchange
    opener = mechanize.build_opener(hh)
    response = opener.open(url)
    contents = response.readlines()
    print(contents)

python automation screen-scraping mechanize

Dustin wyatt May 28, '09 at 17:25

2 answers

Boris Guéry · Answer 1 · 2009-05-28T17:38:11+0000

Did you notice that the site redirects to an HTTPS (SSL) server?

Try setting up a new HTTPS handler like this:

 mechanize.HTTPSHandler()
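
For the cookie half of the question, here is a minimal sketch that uses only the Python 3 standard library instead of mechanize. It assumes Python was built with SSL support, in which case urllib.request openers handle https URLs without an extra handler. The helper name make_cookie_opener is hypothetical, chosen only for this illustration.

```python
import http.cookiejar
import urllib.request


def make_cookie_opener():
    """Return (opener, jar). The jar stores cookies set by any response
    fetched through the opener and sends them back on later requests."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar
```

Calling opener.open('https://steamcommunity.com') and then iterating over jar should show whatever cookies the server set. Every later request made through the same opener reuses them.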

Yuda prawira · Answer 2 · 2011-05-07T12:11:30+0000

Use this trick, I'm sure it will work for you ;)

 br = mechanize.Browser(factory=mechanize.DefaultFactory(i_want_broken_xhtml_support=True))
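
If mechanize's parser still rejects the page, one fallback is to locate the forms yourself. The sketch below is not mechanize: it uses the standard library's html.parser, which tolerates sloppy markup, to list each form's action, method, enctype, and field names. The class FormLister, the helper list_forms, and the sample markup are all made up for illustration. The sample is not Steam's actual HTML.

```python
from html.parser import HTMLParser


class FormLister(HTMLParser):
    """Collect every <form> tag and the names of the fields inside it."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None  # form currently open, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action"),
                "method": (attrs.get("method") or "get").lower(),
                "enctype": attrs.get("enctype"),
                "fields": [],
            }
            self.forms.append(self._current)
        elif tag in ("input", "select", "textarea") and self._current:
            name = attrs.get("name")
            if name:
                self._current["fields"].append(name)

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def list_forms(html):
    """Parse an HTML string and return a list of form descriptions."""
    parser = FormLister()
    parser.feed(html)
    parser.close()
    return parser.forms


# Invented sample with an unclosed <p>, standing in for a messy page.
sample = (
    '<form action="/login" method="POST" enctype="multipart/form-data">'
    '<input name="username"><input name="password" type="password">'
    '</form><p>unclosed'
)
forms = list_forms(sample)
```

The field names found this way can then be URL-encoded and posted through any opener that keeps cookies. Note that a multipart/form-data form needs a multipart request body, which this sketch does not build.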
