How to prevent duplication in XSL?

How can I prevent duplicate entries in a list and then, ideally, sort that list? When information is missing at one level, I take information from the level below it to build the missing list at the higher level. I currently have XML similar to this:

    <c03 id="ref6488" level="file">
      <did>
        <unittitle>Clinic Building</unittitle>
        <unitdate era="ce" calendar="gregorian">1947</unitdate>
      </did>
      <c04 id="ref34582" level="file">
        <did>
          <container label="Box" type="Box">156</container>
          <container label="Folder" type="Folder">3</container>
        </did>
      </c04>
      <c04 id="ref6540" level="file">
        <did>
          <container label="Box" type="Box">156</container>
          <unittitle>Contact prints</unittitle>
        </did>
      </c04>
      <c04 id="ref6606" level="file">
        <did>
          <container label="Box" type="Box">154</container>
          <unittitle>Negatives</unittitle>
        </did>
      </c04>
    </c03>

Then I apply the following XSL:

    <xsl:template match="c03/did">
      <xsl:choose>
        <xsl:when test="not(container)">
          <did>
            <!-- If no c03 container item is found, look in the c04 level for one -->
            <xsl:if test="../c04/did/container">
              <!-- If a c04 container item is found, use the info to build a c03 version -->
              <!-- Skip c03 container item, if still no c04 items found -->
              <container label="Box" type="Box">
                <!-- Build container list -->
                <!-- Test for more than one item, and if so, list them, -->
                <!-- separated by commas and a space -->
                <xsl:for-each select="../c04/did">
                  <xsl:if test="position() &gt; 1">, </xsl:if>
                  <xsl:value-of select="container"/>
                </xsl:for-each>
              </container>
            </xsl:if>
          </did>
        </xsl:when>
        <!-- If there is a c03 container item(s), list it normally -->
        <xsl:otherwise>
          <xsl:copy-of select="."/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:template>

But the resulting "container" element is:

 <container label="Box" type="Box">156, 156, 154</container> 

when I want:

 <container label="Box" type="Box">154, 156</container> 

Below is the full result that I am trying to get:

    <c03 id="ref6488" level="file">
      <did>
        <container label="Box" type="Box">154, 156</container>
        <unittitle>Clinic Building</unittitle>
        <unitdate era="ce" calendar="gregorian">1947</unitdate>
      </did>
      <c04 id="ref34582" level="file">
        <did>
          <container label="Box" type="Box">156</container>
          <container label="Folder" type="Folder">3</container>
        </did>
      </c04>
      <c04 id="ref6540" level="file">
        <did>
          <container label="Box" type="Box">156</container>
          <unittitle>Contact prints</unittitle>
        </did>
      </c04>
      <c04 id="ref6606" level="file">
        <did>
          <container label="Box" type="Box">154</container>
          <unittitle>Negatives</unittitle>
        </did>
      </c04>
    </c03>

Thanks in advance for your help!

6 answers

Try using the following code:

    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
      <xsl:output indent="yes"></xsl:output>

      <xsl:template match="node() | @*">
        <xsl:copy>
          <xsl:apply-templates select="node() | @*"/>
        </xsl:copy>
      </xsl:template>

      <xsl:template match="c03/did">
        <xsl:choose>
          <xsl:when test="not(container)">
            <did>
              <!-- If no c03 container item is found, look in the c04 level for one -->
              <xsl:if test="../c04/did/container">
                <xsl:variable name="foo" select="../c04/did/container[@type='Box']/text()"/>
                <!-- If a c04 container item is found, use the info to build a c03 version -->
                <!-- Skip c03 container item, if still no c04 items found -->
                <container label="Box" type="Box">
                  <!-- Build container list -->
                  <!-- Test for more than one item, and if so, list them, -->
                  <!-- separated by commas and a space -->
                  <xsl:for-each select="distinct-values($foo)">
                    <xsl:sort />
                    <xsl:if test="position() &gt; 1">, </xsl:if>
                    <xsl:value-of select="." />
                  </xsl:for-each>
                </container>
                <xsl:apply-templates select="*" />
              </xsl:if>
            </did>
          </xsl:when>
          <!-- If there is a c03 container item(s), list it normally -->
          <xsl:otherwise>
            <xsl:copy-of select="."/>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:template>
    </xsl:stylesheet>

The output looks pretty much like what you want:

    <?xml version="1.0" encoding="UTF-8"?>
    <c03 id="ref6488" level="file">
      <did>
        <container label="Box" type="Box">154, 156</container>
        <unittitle>Clinic Building</unittitle>
        <unitdate era="ce" calendar="gregorian">1947</unitdate>
      </did>
      <c04 id="ref34582" level="file">
        <did>
          <container label="Box" type="Box">156</container>
          <container label="Folder" type="Folder">3</container>
        </did>
      </c04>
      <c04 id="ref6540" level="file">
        <did>
          <container label="Box" type="Box">156</container>
          <unittitle>Contact prints</unittitle>
        </did>
      </c04>
      <c04 id="ref6606" level="file">
        <did>
          <container label="Box" type="Box">154</container>
          <unittitle>Negatives</unittitle>
        </did>
      </c04>
    </c03>

The trick is to use <xsl:sort> and distinct-values() together. See what is (IMHO) a great book by Michael Kay, "XSLT 2.0 and XPath 2.0".


No XSLT 2.0 solution is needed for this problem.

Here is an XSLT 1.0 solution, which is more compact than the currently accepted XSLT 2.0 solution (35 lines versus 43 lines):

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output omit-xml-declaration="yes" indent="yes"/>
      <xsl:strip-space elements="*"/>

      <xsl:key name="kBoxContainerByVal" match="container[@type='Box']" use="."/>

      <xsl:template match="node()|@*">
        <xsl:copy>
          <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
      </xsl:template>

      <xsl:template match="c03/did[not(container)]">
        <xsl:copy>
          <xsl:variable name="vContDistinctValues" select=
            "/*/*/*/container[@type='Box']
                      [generate-id() = generate-id(key('kBoxContainerByVal', .)[1])]"/>
          <container label="Box" type="Box">
            <xsl:for-each select="$vContDistinctValues">
              <xsl:sort data-type="number"/>
              <xsl:value-of select=
                "concat(., substring(', ', 1 + 2*(position() = last())))"/>
            </xsl:for-each>
          </container>
          <xsl:apply-templates/>
        </xsl:copy>
      </xsl:template>
    </xsl:stylesheet>

When this transformation is applied to the originally provided XML document, the desired, correct result is produced:

    <c03 id="ref6488" level="file">
      <did>
        <container label="Box" type="Box">154, 156</container>
        <unittitle>Clinic Building</unittitle>
        <unitdate era="ce" calendar="gregorian">1947</unitdate>
      </did>
      <c04 id="ref34582" level="file">
        <did>
          <container label="Box" type="Box">156</container>
          <container label="Folder" type="Folder">3</container>
        </did>
      </c04>
      <c04 id="ref6540" level="file">
        <did>
          <container label="Box" type="Box">156</container>
          <unittitle>Contact prints</unittitle>
        </did>
      </c04>
      <c04 id="ref6606" level="file">
        <did>
          <container label="Box" type="Box">154</container>
          <unittitle>Negatives</unittitle>
        </did>
      </c04>
    </c03>

Update:

I had not noticed the requirement to sort the container numbers. The solution now reflects this.


Try using key-based grouping in XSLT. Here is an article about the Muenchian method, which should help eliminate the duplicates: http://www.jenitennison.com/xslt/grouping/muenchian.html
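As a rough illustration of the Muenchian method, here is a minimal XSLT 1.0 sketch that only prints the distinct Box numbers from the question's XML in numeric order. The key name boxByValue is made up for this example, and because the key spans the whole document, the sketch assumes the input holds a single c03 element.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical key: index every Box container by its string value -->
  <xsl:key name="boxByValue" match="container[@type='Box']" use="."/>

  <xsl:template match="/">
    <!-- Keep a container only if it is the first node in its key group,
         which leaves exactly one node per distinct value -->
    <xsl:for-each select="//container[@type='Box']
                            [generate-id() = generate-id(key('boxByValue', .)[1])]">
      <xsl:sort select="." data-type="number"/>
      <xsl:value-of select="."/>
      <!-- Comma and space after every value except the last one -->
      <xsl:if test="position() != last()">, </xsl:if>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

For the sample document in the question this should print 154, 156. If several c03 elements share one document, the key would need to include something that identifies the parent c03 as well.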


A slightly shorter XSLT 2.0 version, combining the approaches from the other answers. Note that the sorting is alphabetical, so if the values "54" and "156" are found, the output will be "156, 54". If numerical sorting is required, use <xsl:sort select="number(.)"/> instead of <xsl:sort/>.

    <xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output omit-xml-declaration="yes" indent="yes"/>
      <xsl:strip-space elements="*"/>

      <xsl:template match="node()|@*">
        <xsl:copy>
          <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
      </xsl:template>

      <xsl:template match="c03/did[not(container)]">
        <xsl:variable name="containers" select="../c04/did/container[@label='Box'][text()]"/>
        <xsl:copy>
          <xsl:copy-of select="@*"/>
          <xsl:if test="$containers">
            <container label="Box" type="Box">
              <xsl:for-each select="distinct-values($containers)">
                <xsl:sort/>
                <xsl:if test="position() != 1">, </xsl:if>
                <xsl:value-of select="."/>
              </xsl:for-each>
            </container>
          </xsl:if>
          <xsl:apply-templates select="node()"/>
        </xsl:copy>
      </xsl:template>
    </xsl:stylesheet>
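
To make the numeric-sort variant concrete, here is a stripped-down XSLT 2.0 sketch that emits only the list. Text output and the match on the document root are choices made to keep the example small, not part of the stylesheet above.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- distinct-values() atomizes the containers and drops repeated values -->
    <xsl:for-each select="distinct-values(//container[@type='Box'])">
      <!-- number(.) makes the comparison numeric, so 54 sorts before 156 -->
      <xsl:sort select="number(.)"/>
      <xsl:if test="position() != 1">, </xsl:if>
      <xsl:value-of select="."/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With a plain <xsl:sort/> the same loop compares the values as strings, which is where the "156, 54" ordering comes from.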

A true XSLT 2.0 solution is also quite short:

    <xsl:stylesheet version="2.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        exclude-result-prefixes="xs">
      <xsl:output omit-xml-declaration="yes" indent="yes"/>

      <xsl:template match="node()|@*">
        <xsl:copy>
          <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
      </xsl:template>

      <xsl:template match="c03/did[not(container)]">
        <xsl:copy>
          <xsl:copy-of select="@*"/>
          <xsl:variable name="vContDistinctValues" as="xs:integer*">
            <xsl:perform-sort select=
              "distinct-values(/*/*/*/container[@type='Box']/text()/xs:integer(.))">
              <xsl:sort/>
            </xsl:perform-sort>
          </xsl:variable>
          <!-- exists() is used because the effective boolean value of a
               sequence of two or more integers is an error -->
          <xsl:if test="exists($vContDistinctValues)">
            <container label="Box" type="Box">
              <!-- ", " matches the comma-and-space list asked for in the question -->
              <xsl:value-of select="$vContDistinctValues" separator=", "/>
            </container>
          </xsl:if>
          <xsl:apply-templates/>
        </xsl:copy>
      </xsl:template>
    </xsl:stylesheet>

Note:

  • Using types eliminates the need to specify data-type in <xsl:sort/>.

  • Using the separator attribute of <xsl:value-of/> joins the values without an explicit loop.
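
Both points can be seen in isolation in a small sketch like the one below. The hard-coded integers stand in for the Box numbers and the variable name sorted is invented for the example; any XML input will do, since the template ignores it.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Sample values only; the items are xs:integer, so the default
         xsl:sort compares them numerically without data-type -->
    <xsl:variable name="sorted" as="xs:integer*">
      <xsl:perform-sort select="distinct-values((156, 54, 156))">
        <xsl:sort/>
      </xsl:perform-sort>
    </xsl:variable>
    <!-- separator is placed between items; without it, XSLT 2.0
         joins them with a single space -->
    <xsl:value-of select="$sorted" separator=", "/>
  </xsl:template>
</xsl:stylesheet>
```

This should print 54, 156, whereas sorting the same values as strings would put 156 first.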


The following XSLT 1.0 transformation does what you are looking for:

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output encoding="utf-8" />

      <!-- key to index containers by these three distinct qualities:
             1: their ancestor <c??> node (represented as its unique ID)
             2: their @type attribute value
             3: their node value (i.e. their text) -->
      <xsl:key
        name="kContainer"
        match="container"
        use="concat(generate-id(../../..), '|', @type, '|', .)"
      />

      <!-- identity template to copy everything as is by default -->
      <xsl:template match="node()|@*">
        <xsl:copy>
          <xsl:apply-templates select="node()|@*" />
        </xsl:copy>
      </xsl:template>

      <!-- special template for <did>s without a <container> child -->
      <xsl:template match="did[not(container)]">
        <xsl:copy>
          <xsl:copy-of select="@*" />
          <container label="Box" type="Box">
            <!-- from subordinate <container>s of type Box, use the ones
                 that are *the first* to have that certain combination
                 of the three distinct qualities mentioned above -->
            <xsl:apply-templates mode="list-values" select="
              ../*/did/container[@type='Box'][
                generate-id()
                =
                generate-id(
                  key(
                    'kContainer',
                    concat(generate-id(../../..), '|', @type, '|', .)
                  )[1]
                )
              ]
            ">
              <!-- sort them by their node value -->
              <xsl:sort select="." data-type="number" />
            </xsl:apply-templates>
          </container>
          <xsl:apply-templates select="node()" />
        </xsl:copy>
      </xsl:template>

      <!-- generic template to make list of values from any node-set -->
      <xsl:template match="*" mode="list-values">
        <xsl:value-of select="." />
        <xsl:if test="position() &lt; last()">
          <xsl:text>, </xsl:text>
        </xsl:if>
      </xsl:template>
    </xsl:stylesheet>

It returns:

    <c03 id="ref6488" level="file">
      <did>
        <container label="Box" type="Box">154, 156</container>
        <unittitle>Clinic Building</unittitle>
        <unitdate era="ce" calendar="gregorian">1947</unitdate>
      </did>
      <c04 id="ref34582" level="file">
        <did>
          <container label="Box" type="Box">156</container>
          <container label="Folder" type="Folder">3</container>
        </did>
      </c04>
      <c04 id="ref6540" level="file">
        <did>
          <container label="Box" type="Box">156</container>
          <unittitle>Contact prints</unittitle>
        </did>
      </c04>
      <c04 id="ref6606" level="file">
        <did>
          <container label="Box" type="Box">154</container>
          <unittitle>Negatives</unittitle>
        </did>
      </c04>
    </c03>

The comparison generate-id() = generate-id(key(...)[1]) is the core of what is called Muenchian grouping: it keeps only the first node in each key group. If you cannot use XSLT 2.0, this is the way to go.
