Preserving whitespace in attributes

Disclaimer: the following is a sin against XML. That is why I am trying to change it using XSLT :)

My XML currently looks like this:

 <root>
   <object name="blarg" property1="shablarg" property2="werg".../>
   <object name="yetanotherobject" .../>
 </root>

Yes, I put all the text data in the attributes. I hope XSLT can save me; I want to move to something like this:

 <root>
   <object>
     <name>blarg</name>
     <property1>shablarg</property1>
     ...
   </object>
   <object>
     ...
   </object>
 </root>

I actually have this working for the most part, except that my sins against XML were more ... exceptional. Some tags look like this:

 <object description = "This is the first line
                                                   
 This is the third line.  That second line full of whitespace is meaningful"/>

I use xsltproc under Linux, but it has no option for preserving whitespace. I tried xsl:preserve-space and xml:space="preserve" to no avail. Every option I have found applies to preserving whitespace inside elements themselves, not inside attributes. Every single time, the above gets changed to:

 This is the first line This is the third line.  That second line full of whitespace is meaningful

So the question is: can I preserve the whitespace in attributes?

+10
xml xslt whitespace
Nov 04 '08 at 0:22
4 answers

This is actually a raw XML parsing issue, not something XSLT can help you with. An XML parser must convert the newlines in that attribute value to spaces, per section 3.3.3 "Attribute-Value Normalization" of the XML standard. So anything that currently reads your description attributes and keeps the newlines is doing it wrong.

You may be able to recover the newlines by pre-processing the raw XML to escape them as &#10; character references, as long as you do not also have newlines in places where charrefs are forbidden, such as inside tag bodies. Charrefs should survive as control characters through to the attribute value, where you can then turn them into text nodes.
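A minimal sketch of that pre-processing step, in Python rather than XSLT because it has to run before any XML parser sees the file. The regular expressions and the function name `escape_attr_newlines` are my own simplified assumptions: they only look inside start tags and assume the file has no comments or CDATA sections containing tag-like text.

```python
import re
import xml.etree.ElementTree as ET

# Simplified start-tag pattern (an assumption, not a full XML grammar):
# an element name followed by name="value" or name='value' pairs.
START_TAG = re.compile(
    r"""<[A-Za-z_:][^\s/>]*(?:\s+[^\s=/>]+\s*=\s*(?:"[^"<]*"|'[^'<]*'))*\s*/?>"""
)
# Inside a start tag, every quoted string is an attribute value.
QUOTED = re.compile(r"""(?:"[^"]*"|'[^']*')""")


def escape_attr_newlines(raw_xml):
    """Replace literal line breaks inside attribute values with &#10;."""

    def fix_value(match):
        value = match.group(0)
        return (value.replace("\r\n", "&#10;")
                     .replace("\r", "&#10;")
                     .replace("\n", "&#10;"))

    def fix_tag(match):
        # Only quoted values are rewritten; newlines between attributes stay.
        return QUOTED.sub(fix_value, match.group(0))

    return START_TAG.sub(fix_tag, raw_xml)


raw = '<root>\n  <object description="first line\n   \nthird line"/>\n</root>'
fixed = escape_attr_newlines(raw)

unfixed_value = ET.fromstring(raw)[0].get("description")
fixed_value = ET.fromstring(fixed)[0].get("description")
print(repr(unfixed_value))
print(repr(fixed_value))
```

After this step the escaped file can be fed to xsltproc as usual, and the attribute value should reach the stylesheet with its line breaks intact.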

+5
Nov 04 '08 at 0:42

According to the annotated XML specification, whitespace in attribute values is normalized by the XML processor (see the (T) annotation at 3.3.3). So it looks like the answer is probably no.
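The normalization is easy to observe with any conforming parser. Here is a small sketch using Python's standard `xml.etree.ElementTree` (my choice of tool for illustration, not something from the question); the sample attribute values are made up. A literal newline or tab in the attribute should come back as a space, while the same characters written as character references should survive.

```python
import xml.etree.ElementTree as ET

# Literal newline and tab inside the attribute value.
literal = ET.fromstring('<object description="line one\nline two\tend"/>')
# The same characters written as character references.
charref = ET.fromstring('<object description="line one&#10;line two&#9;end"/>')

literal_value = literal.get("description")
charref_value = charref.get("description")

print(repr(literal_value))
print(repr(charref_value))
```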

+3
Nov 04 '08 at 0:35

As others have noted, the XML specification does not allow whitespace to be preserved in attributes. In fact, this is one of the few differences between what you can do with attributes and with elements (the other main one being that elements can contain other tags, while attributes cannot).

You will have to process the file outside of XML first in order to preserve the whitespace.

+1
Nov 04 '08 at 3:28

If you can control your XML processor, you can do it.

From my other answer (which links to many references):

If you have XML like

 <?xml version="1.0" encoding="UTF-8" standalone="no"?>
 <!DOCTYPE elemke [
   <!ATTLIST brush wood CDATA #REQUIRED>
 ]>
 <elemke>
   <brush wood="guy&#xA;threep"/>
 </elemke>

and an XSL stylesheet such as

 <?xml version="1.0" encoding="UTF-8"?>
 <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <xsl:template name="split">
     <xsl:param name="list" select="''" />
     <xsl:param name="separator" select="'&#xA;'" />
     <xsl:if test="not($list = '' or $separator = '')">
       <xsl:variable name="head" select="substring-before(concat($list, $separator), $separator)" />
       <xsl:variable name="tail" select="substring-after($list, $separator)" />
       <xsl:value-of select="$head"/>
       <br/><xsl:text>&#xA;</xsl:text>
       <xsl:call-template name="split">
         <xsl:with-param name="list" select="$tail" />
         <xsl:with-param name="separator" select="$separator" />
       </xsl:call-template>
     </xsl:if>
   </xsl:template>

   <xsl:template match="brush">
     <html>
       <xsl:call-template name="split">
         <xsl:with-param name="list" select="@wood"/>
       </xsl:call-template>
     </html>
   </xsl:template>

 </xsl:stylesheet>

you can get HTML like:

 <html>guy<br>
 threep<br>
 </html>

as tested and produced with a processor such as Saxon on the command line:

 java -jar saxon9he.jar -s:in.xml -xsl:in.xsl -o:out.html 
0
Apr 21 '15 at 20:19


