How to remove duplicate XML nodes using XSLT

Question

I have a very long XML file, for example

 <Root>
     <ele1>
         <child1>context1</child1>
         <child2>test1</child2>
         <child1>context1</child1>
     </ele1>
     <ele2>
         <child1>context2</child1>
         <child2>test2</child2>
         <child1>context2</child1>
     </ele2>
     <ele3>...........<elen>
 </Root>

Now I want to remove the second <child1> in each <ele> element using XSLT. Is this possible? The result should look like this:

 <Root>
     <ele1>
         <child1>context1</child1>
         <child2>test1</child2>
     </ele1>
     <ele2>
         <child1>context2</child1>
         <child2>test2</child2>
     </ele2>
     <ele3>...........<elen>
 </Root>

Thanks BR

Allen

xml xslt

Allen Dec 10 '08 at 11:00

3 answers

Dimitre Novatchev · Answer 1 · 2008-12-11T02:50:32+0000

This question requires a slightly more detailed answer than just pointing to a good Muenchian Grouping source.

The reason is that the required grouping must identify both the name of each child of an "ele[SomeString]" element and that child's parent. Such grouping needs a key whose value is determined by both of these sources, usually by concatenating them.

This transformation:

  <xsl:stylesheet version="1.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output omit-xml-declaration="yes" indent="yes"/>

   <!-- Key value: identity of the parent + name of the element -->
   <xsl:key name="kElByName" match="*"
        use="concat(generate-id(..), '+', name())"/>

   <!-- Identity template: copies everything as-is -->
   <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
   </xsl:template>

   <!-- For each ele* element, process only the first child of each name -->
   <xsl:template match="*[starts-with(name(), 'ele')]">
     <xsl:copy>
       <xsl:copy-of select="@*"/>
       <xsl:apply-templates select=
        "*[generate-id()
          =
           generate-id(key('kElByName',
                       concat(generate-id(..), '+', name())
                       )[1])
          ]"
        />
     </xsl:copy>
   </xsl:template>
  </xsl:stylesheet>

when applied to this XML document:

  <Root>
     <ele1>
         <child1>context1</child1>
         <child2>test1</child2>
         <child1>context1</child1>
     </ele1>
     <ele2>
         <child1>context2</child1>
         <child2>test2</child2>
         <child1>context2</child1>
     </ele2>
     <ele3>
         <child2>context2</child2>
         <child2>test2</child2>
         <child1>context1</child1>
     </ele3>
  </Root>

produces the desired result:

  <Root>
     <ele1>
         <child1>context1</child1>
         <child2>test1</child2>
     </ele1>
     <ele2>
         <child1>context2</child1>
         <child2>test2</child2>
     </ele2>
     <ele3>
         <child2>context2</child2>
         <child1>context1</child1>
     </ele3>
  </Root>
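
Note that the key above groups children by parent and element name only, so in <ele3> the second <child2> is removed even though its text differs from the first. If "duplicate" should instead mean the same name and the same text within one parent (an assumption; the question does not say so), the string value can be appended to the key. A minimal sketch along those lines, where the key name kElByNameAndValue is made up for illustration:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Assumed key: parent identity + element name + string value.
       The '+' separator is only safe if names and values never
       contain '+' themselves. -->
  <xsl:key name="kElByNameAndValue" match="*"
       use="concat(generate-id(..), '+', name(), '+', .)"/>

  <!-- Identity template: copies everything as-is -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Suppress a child of an ele* element when it is not the first
       node in its key group (same parent, same name, same text) -->
  <xsl:template match="*[starts-with(name(), 'ele')]/*[generate-id() != generate-id(key('kElByNameAndValue', concat(generate-id(..), '+', name(), '+', .))[1])]"/>
</xsl:stylesheet>
```

Applied to the same three-element document, this variant should keep both <child2> elements in <ele3>, because their text differs, while still dropping the repeated <child1> in <ele1> and <ele2>.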

Abach · Answer 2 · 2012-05-20T02:23:51+0000

If the XML provided by the OP is representative of the question (that is, the second <child1> inside each <ele*> element must be deleted), then Muenchian Grouping is not required:

XSLT:

 <?xml version="1.0" encoding="UTF-8"?>
 <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
     <xsl:output omit-xml-declaration="no" indent="yes"/>
     <xsl:strip-space elements="*"/>

     <!-- Identity Template: copies everything as-is -->
     <xsl:template match="node()|@*">
         <xsl:copy>
             <xsl:apply-templates select="node()|@*"/>
         </xsl:copy>
     </xsl:template>

     <!-- Remove the 2nd <child1> element from each <ele*> element -->
     <xsl:template match="*[starts-with(name(), 'ele')]/child1[2]" />
 </xsl:stylesheet>

When executed against the provided XML:

 <?xml version="1.0" encoding="UTF-8"?>
 <Root>
     <ele1>
         <child1>context1</child1>
         <child2>test1</child2>
         <child1>context1</child1>
     </ele1>
     <ele2>
         <child1>context2</child1>
         <child2>test2</child2>
         <child1>context2</child1>
     </ele2>
 </Root>

... the desired result is obtained:

 <?xml version="1.0" encoding="UTF-8"?>
 <Root>
     <ele1>
         <child1>context1</child1>
         <child2>test1</child2>
     </ele1>
     <ele2>
         <child1>context2</child1>
         <child2>test2</child2>
     </ele2>
 </Root>
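
The child1[2] pattern removes exactly the second <child1> and nothing else. If an <ele*> element might contain three or more <child1> elements, the positional predicate can be widened. A minimal sketch, assuming (the question does not state this) that every <child1> after the first should be dropped regardless of its content:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
    <xsl:output omit-xml-declaration="no" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- Identity template: copies everything as-is -->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

    <!-- Assumption: keep only the first <child1> in each <ele*> element;
         position() counts <child1> siblings only, not other children -->
    <xsl:template match="*[starts-with(name(), 'ele')]/child1[position() &gt; 1]"/>
</xsl:stylesheet>
```

Like the original, this only looks at <child1>; repeated elements with other names, such as <child2>, are left in place, which is where the keyed approach differs.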

annakata · Answer 3 · 2008-12-10T11:07:24+0000

Your XML and question are a little unclear, but what you are looking for is usually called Muenchian Grouping; it is another way of asking for distinct nodes. With the appropriate keys, this can be done very efficiently.
