Xpath: select node, but not specific children

Question

I have a structure similar to the following:

 <page id='1'>
   <title>Page 1</title>
   <page id='2'>
     <title>Sub Page 1</title>
   </page>
   <page id='3'>
     <title>Sub Page 2</title>
   </page>
 </page>
 <page id='4'>
   <title>Page 2</title>
 </page>

I need to select a page by ID, but if that page has child pages, I do not want those elements returned; I still need the page's other elements. If I select Page 1, I want the title returned, but not the child pages ...

 //page[@id=1]

The above expression gives me page 1, but how can I exclude the sub pages? Also, a page can have an arbitrary number of other elements.

 //page[@id=1]/*[not(self::page)]

I found that this gives me the data I want. However, the data is returned as an array of objects, one object per element, and apparently without the element names. I am using PHP SimpleXML, for what it's worth.

+7

xpath

Ben Aug 19 '11 at 1:02

3 answers

If you are only interested in the title element, this will work:

 //page[@id=1]/title

If you need the page's other child elements as well, I'm not sure XPath is the right tool for you. This sounds more like a job for XSLT, because what you are really doing is transforming your data.
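The "transforming" idea can also be sketched without XSLT. The following is a minimal Python sketch, not the asker's PHP setup: it copies the selected page and prunes its nested page elements, so the result is still a page element with its id and its other children. The `<pages>` wrapper and the helper name `page_without_subpages` are assumptions made for this illustration.

```python
import copy
import xml.etree.ElementTree as ET

# Sample data from the question, wrapped in an assumed <pages> root element
XML = """
<pages>
  <page id='1'>
    <title>Page 1</title>
    <page id='2'><title>Sub Page 1</title></page>
    <page id='3'><title>Sub Page 2</title></page>
  </page>
  <page id='4'><title>Page 2</title></page>
</pages>
"""

def page_without_subpages(root, page_id):
    """Return a copy of the first matching page with its child pages removed."""
    page = root.find(f".//page[@id='{page_id}']")
    if page is None:
        return None
    pruned = copy.deepcopy(page)       # leave the original tree untouched
    for sub in pruned.findall("page"):  # direct <page> children only
        pruned.remove(sub)
    return pruned

root = ET.fromstring(XML)
pruned = page_without_subpages(root, "1")
print(ET.tostring(pruned, encoding="unicode"))
```

Working on a deep copy keeps the source document intact, which matters if other pages are looked up afterwards.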

+1

Scott Ferguson Aug 19 '11 at 1:14

If every sub page always has a title (the predicate drops any child that itself contains a title element):

 //page[@id='1']/*[not(boolean(./title))]

0

Msyk Aug 19 '11 at 1:59

Dimitre Novatchev · Accepted Answer · 2011-08-19T13:25:54+0000

Use:

 //page[@id=$yourId]/node()[not(self::page)]

This selects all nodes that are not page elements and are children of any page element in the document whose id attribute has a string value equal to the string contained in $yourId (most likely you would replace $yourId above with the specific string you want, for example '1').

Here is a simple XSLT-based verification:
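The same selection can be approximated in Python's standard library, shown here as a rough sketch rather than a drop-in equivalent. `xml.etree.ElementTree` supports only a subset of XPath (no `node()` or `self::` tests), so the page is located with an attribute predicate and the nested page elements are filtered out in Python. Unlike `node()`, this yields element children only, not text or comment nodes. The `<pages>` wrapper and the function name `non_page_children` are assumptions for this illustration.

```python
import xml.etree.ElementTree as ET

# Sample data from the question, wrapped in an assumed <pages> root element
XML = """
<pages>
  <page id='1'>
    <title>Page 1</title>
    <page id='2'><title>Sub Page 1</title></page>
    <page id='3'><title>Sub Page 2</title></page>
  </page>
  <page id='4'><title>Page 2</title></page>
</pages>
"""

def non_page_children(root, page_id):
    """Collect child elements of every matching page, skipping nested pages."""
    result = []
    # ElementTree's XPath subset supports [@attrib='value'] predicates
    for page in root.iterfind(f".//page[@id='{page_id}']"):
        result.extend(child for child in page if child.tag != "page")
    return result

root = ET.fromstring(XML)
for child in non_page_children(root, "1"):
    # Each item keeps its element name in .tag
    print(child.tag, child.text)
```

Each returned item carries its element name in `.tag`, so the names are not lost when the children come back as a flat list.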

 <xsl:stylesheet version="1.0"
     xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:output omit-xml-declaration="yes" indent="yes"/>
   <xsl:strip-space elements="*"/>

   <xsl:param name="pId" select="3"/>

   <xsl:template match="/">
     <xsl:copy-of select="//page[@id=$pId]/node()[not(self::page)]"/>
   </xsl:template>
 </xsl:stylesheet>

When this transformation is applied to the provided XML document (wrapped in a single top element to make it well-formed):

 <pages>
   <page id='1'>
     <title>Page 1</title>
     <page id='2'>
       <title>Sub Page 1</title>
     </page>
     <page id='3'>
       <title>Sub Page 2</title>
     </page>
   </page>
   <page id='4'>
     <title>Page 2</title>
   </page>
 </pages>

the wanted, correct result is produced:

 <title>Sub Page 2</title>

Note: It is assumed that an id value uniquely identifies a page. If this is not so, the proposed XPath expression selects the non-page children of every page element whose id attribute has the string value of $yourId.

If several pages can share an id and only one page element should be selected, the OP has to specify which of the page elements with that id is wanted.

For example, this may be the first such page:

 (//page[@id=$yourId])[1]/node()[not(self::page)]

or the last:

 (//page[@id=$yourId])[last()]/node()[not(self::page)]

or...
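Choosing one page among several that share an id can be sketched in Python as follows. This is an illustration under assumptions: the sample document with a duplicated id and the helper name `children_of_nth_match` are hypothetical and do not come from the question. `findall` returns matches in document order, so index 0 is the first matching page and index -1 the last.

```python
import xml.etree.ElementTree as ET

# Hypothetical document in which two pages share id='1'
XML = """
<pages>
  <page id='1'>
    <title>First</title>
    <page id='2'><title>Sub</title></page>
  </page>
  <page id='1'>
    <title>Second</title>
  </page>
</pages>
"""

def children_of_nth_match(root, page_id, index):
    """Non-page child elements of one chosen page among those sharing an id."""
    matches = root.findall(f".//page[@id='{page_id}']")  # document order
    if not matches:
        return []
    return [child for child in matches[index] if child.tag != "page"]

root = ET.fromstring(XML)
first = children_of_nth_match(root, "1", 0)    # first page with this id
last = children_of_nth_match(root, "1", -1)    # last page with this id
print([c.text for c in first], [c.text for c in last])
```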
