How to extract the fragment between the body tags (<body> ... </body>) from an AJAX response in JavaScript
The easiest but worst way is simple string hacking on the response text:

var bodyhtml = html.split('<body>').pop().split('</body>')[0];

This is unsatisfactory in the general case, but it can be workable if you know the exact format of the returned HTML (for example, that there are no attributes on <body>, that the sequences <body> and </body> are not used in a comment in the middle of the page, and so on).
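If the <body> tag might carry attributes or be written in upper case, the string approach can be made a little more tolerant with a regular expression. The following is only a sketch: extractBodyHtml is a hypothetical helper name, and it still assumes that no attribute value contains ">" and that no stray <body> or </body> appears in a comment or script.

```javascript
// Hypothetical helper: returns the markup between the first <body ...> open
// tag and the last </body> close tag, or null if either one is missing.
function extractBodyHtml(html) {
  // "<body", optional attributes, then ">" (case-insensitive).
  var open = /<body(?:\s[^>]*)?>/i.exec(html);
  if (!open) return null;
  var start = open.index + open[0].length;

  // Find the position of the last closing tag.
  var closeRe = /<\/body\s*>/gi;
  var end = -1;
  var m;
  while ((m = closeRe.exec(html)) !== null) {
    end = m.index;
  }
  if (end < start) return null;

  return html.slice(start, end);
}
```

This remains text hacking rather than parsing, so it only makes sense when you control the format of the response.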
Another quite bad way is to write the whole document into the innerHTML of a newly created <div> and fish out the elements you want, without worrying that writing <html> or <body> inside a <div> is broken. You cannot reliably separate the child elements of <head> from those of <body> this way, but this is what jQuery does.
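A minimal sketch of the <div> approach, for illustration only: parseIntoDiv is a hypothetical helper, and its doc parameter is assumed to be the page's document object (it is passed in explicitly so the function does not depend on a global).

```javascript
// Hypothetical helper: parse an HTML string by assigning it to the innerHTML
// of a detached scratch <div>. Pass `document` as `doc` in a browser.
function parseIntoDiv(html, doc) {
  var div = doc.createElement('div');
  // In current browsers the parser drops the <html>, <head>, and <body>
  // wrappers here, so the former children of <head> and <body> end up mixed
  // together inside the div.
  div.innerHTML = html;
  return div;
}
```

In a browser you would then search the result, for example parseIntoDiv(html, document).getElementsByTagName('p'). Because the wrappers are lost, there is no dependable way to tell afterwards which elements came from <body>.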
A more reliable but more painful way would be to use a separate HTML document:
// Create a hidden iframe to act as a scratch document.
var iframe = document.createElement('iframe');
iframe.style.display = 'none';
document.body.insertBefore(iframe, document.body.firstChild);

// Older versions of Internet Explorer lack contentDocument.
var idoc = 'contentDocument' in iframe ? iframe.contentDocument : iframe.contentWindow.document;

// Write the fetched page into the iframe and read its parsed body back.
idoc.write(htmlpage);
idoc.close();
alert(idoc.body.innerHTML);
document.body.removeChild(iframe);

However, this also executes any scripts inside the document, which can change it, so this approach may not be satisfactory either.
If your HTML page is on the Internet, you can use YQL.
For example, if your page URL is http://xyz.com/page.html and you want everything inside the body element, run this query:

select * from html where url="http://xyz.com/page.html" and xpath='//body'

If you are new to YQL, read this: http://en.wikipedia.org/wiki/YQL_Page_Scraping
There is also an easy way to do this with the Chromyqlip extension https://chrome.google.com/extensions/detail/bkmllkjbfbeephbldeflbnpclgfbjfmn
Hope this helps you !!!
// Get the XML object for the "body" tag from the XMLHttpRequest/ActiveXObject
// object (requestObj).
// NOTE: This assumes there is only one "body" tag in your HTML document, and
// that the response parsed as XML (otherwise responseXML is null).
var body = requestObj.responseXML.getElementsByTagName("body")[0];

// Get the "body" tag as an XML string.
var bodyXML;
if (typeof XMLSerializer != "undefined") {
  // every browser that provides XMLSerializer
  var serializer = new XMLSerializer();
  bodyXML = serializer.serializeToString(body);
} else if (body.xml) {
  // older Internet Explorer
  bodyXML = body.xml;
}

This gives you the XML for the body tag as a string. Unfortunately, it still includes "<body>" and "</body>", so if you only want the contents of the tag, you will have to strip them off.
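Removing the surrounding <body> tags from the serialized string can be sketched as follows. stripBodyWrapper is a hypothetical helper name, and the sketch assumes the string begins with a single <body ...> open tag (possibly with attributes such as an xmlns declaration added by the serializer) and ends with </body>.

```javascript
// Hypothetical helper: remove the outer <body ...> and </body> tags from a
// serialized body element, leaving only its contents.
function stripBodyWrapper(bodyXML) {
  // An empty element may have been serialized as a self-closing tag.
  if (/^\s*<body(?:\s[^>]*)?\/>\s*$/i.test(bodyXML)) {
    return '';
  }
  return bodyXML
    .replace(/^\s*<body(?:\s[^>]*)?>/i, '')  // leading open tag
    .replace(/<\/body\s*>\s*$/i, '');        // trailing close tag
}
```

With the variable from the snippet above, the call would be stripBodyWrapper(bodyXML).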
You might want to take a look at the second example (“Sample HTML 2 Code”) on this page.