C# - Get the value of a JavaScript variable using HTMLAgilityPack

I currently have two JavaScript variables whose values I need to read. The HTML consists of a series of nested DIVs without id / name attributes. Is it possible to extract the data from these variables using HTMLAgilityPack? If so, how would I do it? If not, what would it take: regular expressions? If the latter, please help me create a regex that will do this. Thanks.

<div style="margin: 12px 0px;" align="left"> <script type="text/javascript"> variable1 = "var1"; variable2 = "var2"; </script> </div> 
1 answer

I assume you are trying to scrape this information from a website that you most likely have no direct control over? There are several ways to do this; I'll list them from easiest to hardest (at least as I see them):

  • Ask the owner of the site. Most of the time they can give you direct access to the information, and if you ask nicely, they may just let you have it for free.

  • Use the WebBrowser control to load the page and run its JavaScript, then parse the values from the DOM afterwards. Unlike HttpWebRequest, which only returns the raw markup, this lets the page's scripts populate all the necessary values before you scrape them.

  • Inspect the traffic with Firebug. Browse the website with Firebug open to see which URLs are called in the background. Most likely the page uses an asynchronous request to fetch updated information from a web service. In Firebug you can see this under Net → XHR. Look at the request and the returned values; you can then call that URL yourself and parse the content from the source rather than scraping the page.
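
For the exact snippet in the question, where both values are plain string literals inside an inline script block, a regular expression over the script text may already be enough. The following is only a minimal sketch under that assumption: the class name `ScriptVariables` and the method `Extract` are made up for illustration, and the pattern only handles double-quoted literals without escaped quotes. It will not see values that are computed at runtime or loaded asynchronously.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of HtmlAgilityPack or any other library.
public static class ScriptVariables
{
    // Matches the body of each <script> element in the raw HTML.
    private static readonly Regex ScriptBlock = new Regex(
        @"<script\b[^>]*>(?<body>.*?)</script>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Matches simple assignments such as: variable1 = "var1";
    // Assumption: double-quoted string literals with no escaped quotes inside.
    private static readonly Regex Assignment = new Regex(
        @"\b(?<name>[A-Za-z_]\w*)\s*=\s*""(?<value>[^""]*)""\s*;");

    // Returns a map from variable name to its literal string value.
    public static Dictionary<string, string> Extract(string html)
    {
        var values = new Dictionary<string, string>();
        foreach (Match script in ScriptBlock.Matches(html))
        {
            string body = script.Groups["body"].Value;
            foreach (Match m in Assignment.Matches(body))
            {
                // A later assignment to the same name overwrites an earlier one.
                values[m.Groups["name"].Value] = m.Groups["value"].Value;
            }
        }
        return values;
    }
}
```

Calling `ScriptVariables.Extract(html)` on the markup from the question should give a dictionary in which `variable1` maps to `var1` and `variable2` maps to `var2`. If the page is already loaded into an HtmlAgilityPack `HtmlDocument`, the `Assignment` pattern could instead be run over the `InnerText` of each node returned by `doc.DocumentNode.SelectNodes("//script")`, which avoids matching the script tags with a regex.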

I think this is the information you were looking for, but if not, let me know and I can clarify / correct the answer.
