Twill and mechanize do not handle Javascript, and Qt and Selenium cannot run on App Engine ((1)), which only supports pure Python code. I do not know of any pure-Python Javascript interpreter, which is what you would need to deploy a JS-capable scraper on App Engine :-(.
Maybe there is something in Java, which would at least let you deploy on (the Java version of) App Engine? The Java and Python versions of an App Engine application can use the same datastore, so you could keep part of your application in Python ... just not the part that needs to understand Javascript. Unfortunately, I don't know enough about the Java / AE environment to suggest any particular package to try.
((1)): To clarify, since there seems to be some misunderstanding on this point: if you run Selenium or another scraper on a different computer, you can of course target a site deployed on App Engine (it doesn't matter how the website you are targeting is deployed, what programming language it uses, etc., as long as it is a site you can access [[real sites: Flash, &c, may be different]]). As I read the question, the OP is looking for ways to run the scraping as part of an App Engine application, and that is the problematic part, not where you (or someone else ;-) run the site being scraped!
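
To illustrate why a scraper without a Javascript engine falls short, here is a minimal pure-Python sketch using only the standard library. The page markup, the `content` id, and the names `DivTextExtractor` and `static_text` are made up for this example; the point is only that a parser which reads the static HTML never sees text that a script would insert at load time.

```python
from html.parser import HTMLParser

# Hypothetical page: the visible text is injected by a script at load time.
PAGE = """
<html><body>
<div id="content"></div>
<script>
document.getElementById("content").textContent = "Hello from JS";
</script>
</body></html>
"""


class DivTextExtractor(HTMLParser):
    """Collects text found inside <div id="content"> in the static markup."""

    def __init__(self):
        super().__init__()
        self.inside = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "div" and dict(attrs).get("id") == "content":
            self.inside = True

    def handle_endtag(self, tag):
        if tag == "div":
            self.inside = False

    def handle_data(self, data):
        # Script bodies also arrive here, but only after the div has closed.
        if self.inside:
            self.chunks.append(data)


def static_text(html):
    """Return the text of the content div as a non-JS scraper would see it."""
    parser = DivTextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks).strip()


if __name__ == "__main__":
    # The script is never executed, so the div is still empty.
    print(repr(static_text(PAGE)))
```

The div comes back empty because nothing ever runs the script, which is the same limitation twill and mechanize have; getting the injected text requires something that actually interprets Javascript.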