The main contribution of this answer is the solution (at the end), which can be used with any number of formats simply by listing all the alternative "entry" names in the external (global) parameter $postElements and all the alternative "published" names in the external (global) parameter $pub-dateElements.
Other than that, here's how to specify an XPath expression that selects all /rss//item elements and all /feed//entry elements.
In the simple case of just two possible document formats, this XPath expression (as @Josh Davis suggested) works correctly:
/rss//item | /feed//entry
A more general XPath expression lets you select the wanted elements from an unlimited number of document formats:
/*[contains($topElements, concat('|',name(),'|'))] //*[contains($postElements, concat('|',name(),'|'))]
where the variable $topElements should be replaced by a pipe-delimited string of all possible names for the top element, and $postElements should be replaced by a pipe-delimited string of all possible names for the "entry" element. This also allows the "entry" elements to be at different depths in different document formats.
For this specific case, the XPath expression will be:
/*[contains('|feed|rss|', concat('|',name(),'|'))] //*[contains('|item|entry|', concat('|',name(),'|'))]
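The delimiters around each name are what make the contains() test safe: without them, any substring of a listed name would also match. A minimal Python sketch of the same membership test follows. The helper name `name_in_list` is hypothetical, and Python's `in` operator stands in for XPath's contains():

```python
def name_in_list(name_list: str, name: str) -> bool:
    """Mimic contains($list, concat('|', name(), '|'))."""
    return f"|{name}|" in name_list


post_elements = "|item|entry|"

print(name_in_list(post_elements, "item"))   # True
print(name_in_list(post_elements, "entry"))  # True
print(name_in_list(post_elements, "try"))    # False: not a whole listed name
print("try" in "item|entry")                 # True: the undelimited test is too loose
```

Since "|" can never appear inside an XML element name, wrapping both the list and the candidate name in it gives an exact whole-name match.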
The rest of this post shows how the complete processing can be done entirely in XSLT, easily and elegantly.
I. Gentle introduction
This kind of processing is quick and easy with XSLT:
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="/">
    <myFeed>
      <xsl:apply-templates/>
    </myFeed>
  </xsl:template>
  <xsl:template match="channel|feed">
    <xsl:apply-templates select="*">
      <xsl:sort select="pubDate|published" order="descending"/>
    </xsl:apply-templates>
  </xsl:template>
  <xsl:template match="item|entry">
    <post>
      <xsl:apply-templates mode="identity"/>
    </post>
  </xsl:template>
  <xsl:template match="pubDate|published" mode="identity">
    <publicationDate>
      <xsl:apply-templates/>
    </publicationDate>
  </xsl:template>
  <xsl:template match="node()|@*" mode="identity">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*" mode="identity"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
When this transformation is applied to this XML document (in format 1):
<rss>
  <channel>
    <item>
      <pubDate>2011-06-05</pubDate>
      <title>Title1</title>
      <description>Description1</description>
      <link>Link1</link>
      <author>Author1</author>
    </item>
    <item>
      <pubDate>2011-06-06</pubDate>
      <title>Title2</title>
      <description>Description2</description>
      <link>Link2</link>
      <author>Author2</author>
    </item>
    <item>
      <pubDate>2011-06-07</pubDate>
      <title>Title3</title>
      <description>Description3</description>
      <link>Link3</link>
      <author>Author3</author>
    </item>
  </channel>
</rss>
and when it is applied to this equivalent document (in format 2):
<feed>
  <entry>
    <published>2011-06-05</published>
    <title>Title1</title>
    <description>Description1</description>
    <link>Link1</link>
    <author>Author1</author>
  </entry>
  <entry>
    <published>2011-06-06</published>
    <title>Title2</title>
    <description>Description2</description>
    <link>Link2</link>
    <author>Author2</author>
  </entry>
  <entry>
    <published>2011-06-07</published>
    <title>Title3</title>
    <description>Description3</description>
    <link>Link3</link>
    <author>Author3</author>
  </entry>
</feed>
in both cases the same wanted, correct result is produced:
<myFeed>
  <post>
    <publicationDate>2011-06-07</publicationDate>
    <title>Title3</title>
    <description>Description3</description>
    <link>Link3</link>
    <author>Author3</author>
  </post>
  <post>
    <publicationDate>2011-06-06</publicationDate>
    <title>Title2</title>
    <description>Description2</description>
    <link>Link2</link>
    <author>Author2</author>
  </post>
  <post>
    <publicationDate>2011-06-05</publicationDate>
    <title>Title1</title>
    <description>Description1</description>
    <link>Link1</link>
    <author>Author1</author>
  </post>
</myFeed>
II. Complete solution
This can be generalized to a parameterized solution:
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <!-- Pipe-delimited lists of the alternative element names -->
  <xsl:param name="postElements" select="'|entry|item|'"/>
  <xsl:param name="pub-dateElements" select="'|published|pubDate|'"/>
  <!-- Identity rule. Children are processed in the default mode (no mode
       attribute), so nested elements and attributes are copied instead of
       being flattened to text by the built-in rules. -->
  <xsl:template match="node()|@*" name="identity">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>
  <!-- Select every "entry" element at any depth, newest first -->
  <xsl:template match="/">
    <myFeed>
      <xsl:apply-templates select=
          "//*[contains($postElements, concat('|',name(),'|'))]">
        <xsl:sort order="descending" select=
            "*[contains($pub-dateElements, concat('|',name(),'|'))]"/>
      </xsl:apply-templates>
    </myFeed>
  </xsl:template>
  <!-- Rename the listed elements; copy everything else unchanged -->
  <xsl:template match="*">
    <xsl:choose>
      <xsl:when test=
          "contains($postElements, concat('|',name(),'|'))">
        <post>
          <xsl:apply-templates/>
        </post>
      </xsl:when>
      <xsl:when test=
          "contains($pub-dateElements, concat('|',name(),'|'))">
        <publicationDate>
          <xsl:apply-templates/>
        </publicationDate>
      </xsl:when>
      <xsl:otherwise>
        <xsl:call-template name="identity"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
This transformation can be used with any number of formats simply by listing all the alternative "entry" names in the external (global) parameter $postElements and all the alternative "published" names in the external (global) parameter $pub-dateElements.
Anyone can run this transformation to verify that, when applied to the two XML documents provided above, it again produces the same wanted, correct result.
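For readers who want to check the logic outside an XSLT processor, here is a rough Python port of the parameterized approach using only the standard library. It is a sketch, not a replacement for the stylesheet: the function name `normalize_feed` and its keyword arguments are hypothetical, and it assumes un-namespaced documents like the two samples above (ElementTree reports namespaced tags as `{uri}name`, which would not match the lists).

```python
import copy
import xml.etree.ElementTree as ET


def normalize_feed(xml_text, post_elements="|entry|item|",
                   pub_date_elements="|published|pubDate|"):
    """Collect post elements from any depth, newest first, under <myFeed>."""

    def listed(names, element):
        # Same whole-name test as contains($names, concat('|', name(), '|')).
        return f"|{element.tag}|" in names

    def pub_date(post):
        # Sort key: text of the first child that is a publication-date element.
        for child in post:
            if listed(pub_date_elements, child):
                return child.text or ""
        return ""

    root = ET.fromstring(xml_text)
    posts = [e for e in root.iter() if listed(post_elements, e)]

    feed = ET.Element("myFeed")
    for post in sorted(posts, key=pub_date, reverse=True):
        out = ET.SubElement(feed, "post")
        for child in post:
            if listed(pub_date_elements, child):
                ET.SubElement(out, "publicationDate").text = child.text
            else:
                out.append(copy.deepcopy(child))  # copy other children unchanged
    return feed


rss = """<rss><channel>
  <item><pubDate>2011-06-05</pubDate><title>Title1</title></item>
  <item><pubDate>2011-06-07</pubDate><title>Title3</title></item>
</channel></rss>"""

feed = normalize_feed(rss)
print([p.findtext("title") for p in feed])  # ['Title3', 'Title1']
```

Supporting a third format would then only mean extending the two name lists, for example passing `post_elements="|entry|item|article|"` (a made-up element name), which mirrors overriding $postElements in the stylesheet.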