innerHTML unencodes &lt; in attributes

I have an HTML document that may contain &lt; and &gt; in some attributes. I am trying to extract the markup and run it through XSLT, but the XSLT processor fails with an error saying that < is not valid inside an attribute.

I did a bit of digging and found that these characters are properly escaped in the original document, but once the markup is loaded into the DOM via innerHTML, the DOM does not re-encode them in attributes when it is read back. Oddly enough, it unencodes &lt; and &gt; but not some others such as &amp;.

Here is a simple example:

    var div = document.createElement('DIV');
    div.innerHTML = '<div asdf="&lt;50" fdsa="&amp;50"></div>';
    console.log(div.innerHTML);

I assume the DOM implementers decided that HTML attributes can be less strict than XML attributes, and that this is "working as intended." My question is: can I work around this without writing some horrible regex replacement?
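To make the mismatch concrete, here is a minimal sketch of the two escaping rules as I understand them. The function names are hypothetical, and `htmlAttrEscapeModel` only models the innerHTML output observed above; it is not taken from any browser's source, and other browsers or versions may behave differently.

```javascript
// Simplified model of the HTML attribute serialization observed in the
// question: "&" and '"' are escaped, but "<" and ">" are left alone.
// Assumption: this mirrors the observed output, not a browser implementation.
function htmlAttrEscapeModel(value) {
  return value.replace(/&/g, '&amp;').replace(/"/g, '&quot;');
}

// What an XML parser requires inside a double-quoted attribute value:
// "&", "<" and '"' must all be escaped.
function xmlAttrEscape(value) {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/"/g, '&quot;');
}

console.log(htmlAttrEscapeModel('<50')); // "<" survives, which XML rejects
console.log(xmlAttrEscape('<50'));       // "<" is escaped, which XML accepts
```

The gap between the two functions is exactly the `<` case that the XSLT processor complains about.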

+6
3 answers

What worked best for me was to double-escape them using XSLT on the incoming document (and undo this on the outgoing document).

So &lt; becomes &amp;lt; in the attribute. Thanks to @Abel for the suggestion.
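For readers who find JavaScript easier to follow than XSLT, the round trip can be sketched roughly as below. The function names are hypothetical and this is only an illustration of the idea, operating on the decoded attribute value; the actual fix in this answer is done in XSLT.

```javascript
// Incoming direction: turn raw angle brackets in the attribute value into
// the literal text "&lt;" / "&gt;". An XML serializer then writes the "&"
// as "&amp;", giving "&amp;lt;" in the markup handed to the browser.
function doubleEscapeAngles(value) {
  return value.replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Outgoing direction: turn the literal text "&lt;" / "&gt;" back into
// raw angle brackets.
function restoreAngles(value) {
  return value.replace(/&lt;/g, '<').replace(/&gt;/g, '>');
}

const original = 'a < b && b > c';
const escaped = doubleEscapeAngles(original);
console.log(escaped);
console.log(restoreAngles(escaped) === original);
```

One caveat of this approach: if an attribute value already contains the literal text `&lt;` before escaping, the restore step cannot tell it apart from an escaped `<`, so the round trip is not lossless in that case.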

Here is the XSLT I added in case others find it useful:

First, this is a template for performing string replacement in XSLT 1.0. If you can use XSLT 2.0, you can use the built-in replace() function instead.

    <xsl:template name="string-replace-all">
      <xsl:param name="text"/>
      <xsl:param name="replace"/>
      <xsl:param name="by"/>
      <xsl:choose>
        <xsl:when test="contains($text, $replace)">
          <xsl:value-of select="substring-before($text,$replace)"/>
          <xsl:value-of select="$by"/>
          <xsl:call-template name="string-replace-all">
            <xsl:with-param name="text" select="substring-after($text,$replace)"/>
            <xsl:with-param name="replace" select="$replace"/>
            <xsl:with-param name="by" select="$by"/>
          </xsl:call-template>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="$text"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:template>

Next are the templates that perform the specific replacements:

    <!-- xml -> html -->
    <xsl:template name="replace-html-codes">
      <xsl:param name="text"/>
      <xsl:variable name="lt">
        <xsl:call-template name="string-replace-all">
          <xsl:with-param name="text" select="$text"/>
          <xsl:with-param name="replace" select="'&lt;'"/>
          <xsl:with-param name="by" select="'&amp;lt;'"/>
        </xsl:call-template>
      </xsl:variable>
      <xsl:variable name="gt">
        <xsl:call-template name="string-replace-all">
          <xsl:with-param name="text" select="$lt"/>
          <xsl:with-param name="replace" select="'&gt;'"/>
          <xsl:with-param name="by" select="'&amp;gt;'"/>
        </xsl:call-template>
      </xsl:variable>
      <xsl:value-of select="$gt"/>
    </xsl:template>

    <!-- html -> xml -->
    <xsl:template name="restore-html-codes">
      <xsl:param name="text"/>
      <xsl:variable name="lt">
        <xsl:call-template name="string-replace-all">
          <xsl:with-param name="text" select="$text"/>
          <xsl:with-param name="replace" select="'&amp;lt;'"/>
          <xsl:with-param name="by" select="'&lt;'"/>
        </xsl:call-template>
      </xsl:variable>
      <xsl:variable name="gt">
        <xsl:call-template name="string-replace-all">
          <xsl:with-param name="text" select="$lt"/>
          <xsl:with-param name="replace" select="'&amp;gt;'"/>
          <xsl:with-param name="by" select="'&gt;'"/>
        </xsl:call-template>
      </xsl:variable>
      <xsl:value-of select="$gt"/>
    </xsl:template>

The XSLT is mostly a pass-through. I just call the appropriate template when copying the attributes:

    <xsl:template match="@*">
      <xsl:attribute name="data-{local-name()}">
        <xsl:call-template name="replace-html-codes">
          <xsl:with-param name="text" select="."/>
        </xsl:call-template>
      </xsl:attribute>
    </xsl:template>

    <!-- copy all nodes -->
    <xsl:template match="node()">
      <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
      </xsl:copy>
    </xsl:template>
0

Try XMLSerializer:

    var div = document.getElementById('d1');

    // Show the HTML serialization (outerHTML) for comparison.
    var pre = document.createElement('pre');
    pre.textContent = div.outerHTML;
    document.body.appendChild(pre);

    // Show the XML serialization produced by XMLSerializer.
    pre = document.createElement('pre');
    pre.textContent = new XMLSerializer().serializeToString(div);
    document.body.appendChild(pre);

    <div id="d1" data-foo="a &lt; b &amp;&amp; b &gt; c">This is a test</div>

You may need to adapt the XSLT to account for the XHTML namespace that XMLSerializer inserts (at least in my test here with Firefox).
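If changing the stylesheet is not convenient, one possible alternative is to drop the namespace declaration from the serialized string before handing it to the XSLT processor. The sketch below uses a hypothetical helper, `stripXhtmlNamespace`, and assumes the serializer writes the declaration exactly as `xmlns="http://www.w3.org/1999/xhtml"` with double quotes; check the actual output in your browser before relying on it.

```javascript
// Assumption: the serializer emits the default XHTML namespace in exactly
// this form; other serializers may quote or order attributes differently.
const XHTML_NS_DECL = ' xmlns="http://www.w3.org/1999/xhtml"';

// Hypothetical helper: remove the default-namespace declaration so that
// un-prefixed XSLT match patterns (e.g. match="div") still apply after
// the string is parsed as XML.
function stripXhtmlNamespace(serialized) {
  return serialized.split(XHTML_NS_DECL).join('');
}

// Sample string shaped like XMLSerializer output for the test element.
const sample =
  '<div xmlns="http://www.w3.org/1999/xhtml" id="d1" data-foo="a &lt; b">This is a test</div>';
console.log(stripXhtmlNamespace(sample));
```

The cleaner option is usually to declare the XHTML namespace in the stylesheet and match on prefixed names; the string approach is only a quick workaround.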

+2

I'm not sure if this is what you are looking for, but take a look.

    var div1 = document.createElement('DIV');
    var div2 = document.createElement('DIV');
    div1.setAttribute('asdf', '&lt;50');
    div1.setAttribute('fdsa', '&amp;50');
    div2.appendChild(div1);
    console.log(div2.innerHTML.replace(/&amp;/g, '&'));
0
