My current project involves collecting textual content from an element and all its descendants based on the provided selector.
For example, when the selector #content is run against this HTML:
<div id="content">
  <p>This is some text.</p>
  <script type="text/javascript">
    var test = true;
  </script>
  <p>This is some more text.</p>
</div>
my script will return (after a little whitespace cleanup):
This is some text. var test = true; This is some more text.
However, I need to ignore the text nodes that occur inside the <script> elements.
This is a snippet of my current code (technically, it matches one or more provided selectors):
// get text content of all matching elements
for (x = 0; x < selectors.length; x++) { // 'selectors' is an array of CSS selectors from which to gather text content
    matches = Sizzle(selectors[x], document);
    for (y = 0; y < matches.length; y++) {
        match = matches[y];
        if (match.innerText) { // IE
            content += match.innerText + ' ';
        } else if (match.textContent) { // other browsers
            content += match.textContent + ' ';
        }
    }
}
This is a little too simplistic, as it returns all the text inside each element that matches the provided selector, descendants included. The solution I'm looking for would return all text nodes except those inside <script> elements. It doesn't need to be especially high-performance, but it does need to work across browsers.
I assume that I need to somehow loop through all the descendants of each element that matches the selector and accumulate every text node except those inside <script> elements; there doesn't seem to be a way to identify the JavaScript once it has already been rolled into the string accumulated from all the text nodes.
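To make that idea concrete, here is a rough sketch of the kind of traversal I have in mind. The function name getTextExcludingScripts is made up for illustration, and the sketch assumes that only the basic DOM node properties (nodeType, nodeName, nodeValue, childNodes) are needed, which should be available in all the browsers I care about. It is not tested against real pages.

```javascript
// Hypothetical helper: collect text from a node and its descendants,
// skipping everything inside <script> elements.
// Assumes only basic DOM properties: nodeType, nodeName, nodeValue, childNodes.
function getTextExcludingScripts(node) {
    var ELEMENT_NODE = 1, TEXT_NODE = 3;

    // Text nodes contribute their own value.
    if (node.nodeType === TEXT_NODE) {
        return node.nodeValue;
    }

    // Ignore comments and other non-element nodes, and skip <script> subtrees.
    if (node.nodeType !== ELEMENT_NODE ||
            node.nodeName.toLowerCase() === 'script') {
        return '';
    }

    // Recurse into the children and concatenate their text in document order.
    var text = '';
    for (var i = 0; i < node.childNodes.length; i++) {
        text += getTextExcludingScripts(node.childNodes[i]);
    }
    return text;
}
```

In the loop shown earlier, the innerText / textContent branch would then presumably become a single line: content += getTextExcludingScripts(matches[y]) + ' ';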
I cannot use jQuery (for performance/bandwidth reasons), although you may have noticed that I do use its Sizzle selector engine, so jQuery's selector logic is available.
Thanks in advance for your help!
javascript string dom text textnode
Bungle