Extract nodes from multiple XML files

I have three XML files with a similar structure, and I would like to use an XPath expression to extract all matching nodes from these files and write them to a single file.

Do you know a good tool for this?

I am thinking of something like:

$supermagicxpathtool -x "//whoopdee" file1.xml file2.xml file3.xml > resultfile.xml 
6 answers

xmlstarlet can extract nodes, but I'm not sure whether it can combine the results from several files.


XPath can only select nodes; it cannot write to a file.

In XPath 1.0, there is no standard way to refer, in a single expression, to nodes that belong to more than one XML document. If the language hosting XPath is XSLT, the document nodes of the three XML documents can be held in three separate xsl:variable elements: $doc1 , $doc2 and $doc3 . A single expression can then select from all of them:

 $doc1//whoopdee | $doc2//whoopdee | $doc3//whoopdee 

Alternatively, the XSLT function document() can be used directly:

  document('file1.xml')//whoopdee | document('file2.xml')//whoopdee | document('file3.xml')//whoopdee 

To output the result of XPath expressions above using XSLT, simply write:

 <xsl:copy-of select="$doc1//whoopdee | $doc2//whoopdee | $doc3//whoopdee"/> 

or

 <xsl:copy-of select="document('file1.xml')//whoopdee | document('file2.xml')//whoopdee | document('file3.xml')//whoopdee"/> 

In XPath 2.0, you can use the standard doc() function instead, so the expression no longer depends on the language that hosts XPath.
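For readers without an XSLT processor at hand, the same "select from each document, then copy the union" idea can be sketched with Python's standard library. This is only an illustration under assumptions: the function name `collect_matches` and the wrapper element `results` are hypothetical, and `//whoopdee` is emulated with `Element.iter`, which visits the root element and all of its descendants.

```python
import copy
import io
import xml.etree.ElementTree as ET


def collect_matches(sources, tag="whoopdee", root_tag="results"):
    """Copy every <tag> element found in each source under one new root.

    `sources` may hold filenames or file-like objects. `root_tag` is a
    hypothetical wrapper name; the original thread does not prescribe one.
    """
    root = ET.Element(root_tag)
    for source in sources:
        doc_root = ET.parse(source).getroot()
        # iter() walks the root and all descendants, like //whoopdee.
        for node in doc_root.iter(tag):
            match = copy.deepcopy(node)
            match.tail = None  # drop whitespace that followed the node
            root.append(match)
    return root


if __name__ == "__main__":
    # In-memory stand-ins for file1.xml, file2.xml and file3.xml.
    docs = [
        io.StringIO("<a><whoopdee id='1'/><other/></a>"),
        io.StringIO("<b><c><whoopdee id='2'>text</whoopdee></c></b>"),
        io.StringIO("<d/>"),
    ]
    merged = collect_matches(docs)
    print(ET.tostring(merged, encoding="unicode"))
```

As with xsl:copy-of, a matching element nested inside another match is copied twice: once inside its ancestor and once on its own.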

Command line

You can use any XSLT processor that can be invoked from the command line, and most can. They also let you pass simple parameters on the command line, usually in the form name=value . Finally, most XSLT processors let you specify the output file for the result as an option. Here is a link to the Saxon documentation on running it from the command line:

http://www.saxonica.com/documentation/using-xsl/commandline.html


Using xml-cat from the xml-coreutils package gives this a Unix flavor:

 xml-cat file1.xml file2.xml file3.xml | \
 xmlstarlet sel -R -t -c /root/whoopdee - | \
 xmlstarlet fo > resultfile.xml

xmlstarlet can copy a node into another document (so this looks like a first step toward a solution):

 # code example from:
 # "How to copy a node to another document",
 # http://sourceforge.net/projects/xmlstar/forums/forum/226076/topic/3558346
 xml sel -R -t -c / -c "document('f2.xml')" f1.xml | \
 xml ed -m /xml-select/Module_0 /xml-select/cnpsXML/Destinations/Module_0/Filter_1 | \
 xml sel -t -c /xml-select/* - | xml fo

 # In pseudo code:
 # 1. Combine both documents into one (using -R to keep the combo a valid XML file - genius!)
 # 2. Move the element from f2.xml to its final destination

To extract all matching nodes as plain text (without markup), or to get the generated XSLT instead, we can do the following:

 xmlstarlet sel -t -m "//whoopdee" -v '@*' -v '.' -n file1.xml > resultfile
 xmlstarlet sel -C -t -m "//whoopdee" -v '@*' -v '.' -n file1.xml > resultfile.xsl
 xml tr resultfile.xsl file1.xml
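A rough Python counterpart of the plain-text extraction above, as a sketch only: for each matching element it emits one line with the attribute values followed by the text content. The function name `matches_as_text` and the space-separated layout are assumptions made for illustration, and the output is not guaranteed to match what xmlstarlet prints character for character.

```python
import io
import xml.etree.ElementTree as ET


def matches_as_text(source, tag="whoopdee"):
    """Return one line per matching element: attribute values, then text."""
    lines = []
    for node in ET.parse(source).getroot().iter(tag):
        attrs = " ".join(node.attrib.values())
        # itertext() yields the text of the element and of its descendants.
        text = "".join(node.itertext()).strip()
        lines.append(" ".join(part for part in (attrs, text) if part))
    return lines


if __name__ == "__main__":
    # In-memory stand-in for file1.xml.
    sample = io.StringIO(
        "<r><whoopdee id='1' kind='x'>hello <b>world</b></whoopdee></r>"
    )
    for line in matches_as_text(sample):
        print(line)
```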

So, building on my previous xmlstarlet post, it seems this can be done as follows:

 xmlstarlet sel -R -t -c / -c "document('file2.xml')" -c "document('file3.xml')" file1.xml | \
 xmlstarlet sel -R -t -c /xml-select/*/whoopdee - | xmlstarlet fo > resultfile.xml
 xmlstarlet val resultfile.xml
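The two-stage shape of that pipeline (first combine all documents under one wrapper root, then run a single path expression over the combined tree) can be sketched in Python as well. The helper names `combine` and `select` are hypothetical; the wrapper tag `xml-select` simply mirrors the root element that the pipeline's own path expression refers to, and the path `./*/whoopdee` is the ElementTree spelling of `/xml-select/*/whoopdee`.

```python
import copy
import io
import xml.etree.ElementTree as ET


def combine(sources, wrapper_tag="xml-select"):
    """Place the root element of every source document under one wrapper."""
    wrapper = ET.Element(wrapper_tag)
    for source in sources:
        wrapper.append(ET.parse(source).getroot())
    return wrapper


def select(wrapper, path="./*/whoopdee", result_tag="xml-select"):
    """Run one path expression over the combined tree and wrap the matches."""
    result = ET.Element(result_tag)
    for node in wrapper.findall(path):
        match = copy.deepcopy(node)
        match.tail = None  # drop whitespace that followed the node
        result.append(match)
    return result


if __name__ == "__main__":
    # In-memory stand-ins for file1.xml and file2.xml.
    docs = [
        io.StringIO("<a><whoopdee>1</whoopdee><x><whoopdee>deep</whoopdee></x></a>"),
        io.StringIO("<b><whoopdee>2</whoopdee></b>"),
    ]
    result = select(combine(docs))
    print(ET.tostring(result, encoding="unicode"))
```

Note that this path only matches whoopdee elements that are direct children of each document's root, so the nested "deep" element in the demo is skipped; a path such as `.//whoopdee` would pick it up too.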

You seem to be looking for the xpath tool, which is in the libxml-xpath-perl package on Ubuntu and most likely on other Debian-based distributions as well.

 xpath [-s suffix] [-p prefix] [-q] -e query [-e query] ... [file] ... 

Source: https://habr.com/ru/post/1315595/