I am trying to create a simple utility in Node with zombie.js that visits a page, finds and opens all the links on that page, and makes sure that each child page successfully returns a 200.
Here is an example of such code (written in CoffeeScript) that crawls the stackoverflow.com homepage:
    Browser = require('zombie')

    browserOpts =
      runScripts: false
      site: 'http://www.stackoverflow.com'

    home = new Browser browserOpts
    home.visit '/', (e, browser) ->
      questions = browser.queryAll '#question-mini-list .summary h3 a'
      for q in questions
        qUrl = q.getAttribute 'href'
        page = new Browser browserOpts
        page.visit qUrl, (e, browser, statusCode, errors) ->
          console.log "Arrived at page #{browser.window.location} and found " + browser.html().length + " bytes"
          console.log statusCode
          browser.dump()
          return
      return
If you run this code, you will notice that the first few links load correctly, and the number of bytes on each page is displayed.
However, after the first batch of successful page loads (the size of which seems random), all subsequent page loads seem to fire the visit callback prematurely. The document is empty (it's just <html><head></head><body></body></html>), and the statusCode argument passed to the callback is undefined.
I cannot explain or understand why this is happening. Any advice would be greatly appreciated.