Skip execution of specific JavaScript files in HtmlUnit

I have a URL, and I want to get the page source after its JavaScript has executed.

Fetching page source using HtmlUnit: URL stuck

Initially, I suspected that the URL was getting stuck because of limited system resources and high CPU usage.

Then I tried HtmlUnit 2.9 and 2.11. Both got stuck during parsing. See the question linked above for the HtmlUnit code that gets stuck.

Now I suspect that the JavaScript execution is going into an infinite loop.

I want to check which JS files are causing the problem and exclude them from execution.

If they are scripts for services like Google Analytics or Twitter, I may not need them at all.

So I am looking for a way to tell HtmlUnit to ignore a specific JS file and execute the rest.

Does anyone know how to do this?

1 answer

Try this. It worked for me:

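    import java.io.IOException;
    import com.gargoylesoftware.htmlunit.WebClient;
    import com.gargoylesoftware.htmlunit.WebRequest;
    import com.gargoylesoftware.htmlunit.WebResponse;
    import com.gargoylesoftware.htmlunit.util.FalsifyingWebConnection;

    class InterceptWebConnection extends FalsifyingWebConnection {
        public InterceptWebConnection(WebClient webClient) throws IllegalArgumentException {
            super(webClient);
        }

        @Override
        public WebResponse getResponse(WebRequest request) throws IOException {
            // Answer with an empty script instead of downloading the unwanted file.
            if (request.getUrl().toString().endsWith("dom-drag.js")) {
                return createWebResponse(request, "", "application/javascript", 200, "Ok");
            }
            // Every other request is fetched once, unchanged.
            return super.getResponse(request);
        }
    }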

Then add the following when setting up the webClient. The constructor installs the wrapper as the client's web connection:

 new InterceptWebConnection(webClient); 
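
If more than one script has to be skipped (the question mentions Google Analytics and Twitter), the single `endsWith` check could be swapped for a small blocklist lookup. The following is only a sketch: `ScriptBlocklist`, `shouldSkip`, and the listed URL fragments are hypothetical names and values chosen for illustration, not part of HtmlUnit, and the fragments would need to match the scripts that actually hang on your page.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Hypothetical helper: decides whether a script URL should get an empty response. */
final class ScriptBlocklist {
    // Illustrative entries only (assumed, not taken from the original answer).
    private static final List<String> BLOCKED_FRAGMENTS = Arrays.asList(
            "google-analytics.com",
            "platform.twitter.com",
            "dom-drag.js");

    private ScriptBlocklist() {
    }

    /** Returns true if the URL contains any blocked fragment, ignoring case. */
    static boolean shouldSkip(String url) {
        if (url == null) {
            return false;
        }
        String lower = url.toLowerCase(Locale.ROOT);
        for (String fragment : BLOCKED_FRAGMENTS) {
            if (lower.contains(fragment)) {
                return true;
            }
        }
        return false;
    }
}
```

Inside `getResponse`, the condition would then read `ScriptBlocklist.shouldSkip(request.getUrl().toString())`. Matching on substrings is a deliberately loose choice here; it may also catch unrelated URLs that happen to contain one of the fragments.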