Split an XML document, creating multiple output files from repeating elements

I need to take an XML file and create several output XML files from the repeating nodes of the input file. The source file "AnimalBatch.xml" looks like this:

    <?xml version="1.0" encoding="utf-8" ?>
    <Animals>
      <Animal id="1001">
        <Quantity>One</Quantity>
        <Adjective>Red</Adjective>
        <Name>Rooster</Name>
      </Animal>
      <Animal id="1002">
        <Quantity>Two</Quantity>
        <Adjective>Stubborn</Adjective>
        <Name>Donkeys</Name>
      </Animal>
      <Animal id="1003">
        <Quantity>Three</Quantity>
        <Adjective>Blind</Adjective>
        <Name>Mice</Name>
      </Animal>
    </Animals>

The program should split on the repeating "Animal" element and create 3 files named Animal_1001.xml, Animal_1002.xml and Animal_1003.xml.

Each output file should contain only the corresponding Animal element (which will be the root). The id attribute from AnimalBatch.xml supplies the sequence number for the Animal_xxxx.xml file names. The id attribute does not have to be in the output files.


Animal_1001.xml:

    <?xml version="1.0" encoding="utf-8"?>
    <Animal>
      <Quantity>One</Quantity>
      <Adjective>Red</Adjective>
      <Name>Rooster</Name>
    </Animal>

Animal_1002.xml:

    <?xml version="1.0" encoding="utf-8"?>
    <Animal>
      <Quantity>Two</Quantity>
      <Adjective>Stubborn</Adjective>
      <Name>Donkeys</Name>
    </Animal>

Animal_1003.xml:

    <?xml version="1.0" encoding="utf-8"?>
    <Animal>
      <Quantity>Three</Quantity>
      <Adjective>Blind</Adjective>
      <Name>Mice</Name>
    </Animal>

I want to do this with an XmlDocument, since it has to work on .NET 2.0.

My program looks like this:

    static void Main(string[] args)
    {
        string strFileName;
        string strSeq;
        XmlDocument doc = new XmlDocument();
        doc.Load("D:\\Rick\\Computer\\XML\\AnimalBatch.xml");
        XmlNodeList nl = doc.DocumentElement.SelectNodes("Animal");
        foreach (XmlNode n in nl)
        {
            strSeq = n.Attributes["id"].Value;
            XmlDocument outdoc = new XmlDocument();
            XmlNode rootnode = outdoc.CreateNode("element", "Animal", "");
            outdoc.AppendChild(rootnode); // Put the wrapper element into outdoc
            outdoc.ImportNode(n, true);   // place the node n into outdoc
            outdoc.AppendChild(n);        // This statement errors:
            // "The node to be inserted is from a different document context."
            strFileName = "Animal_" + strSeq + ".xml";
            outdoc.Save(Console.Out);
            Console.WriteLine();
        }
        Console.WriteLine("END OF PROGRAM: Press <ENTER>");
        Console.ReadLine();
    }

I think I have 2 problems.

A) After running ImportNode on node n in outdoc, I call outdoc.AppendChild(n), which complains: "The node to be inserted is from a different document context." I don't know whether this is a scope issue with referencing node n inside the foreach loop, or whether I am using ImportNode() or AppendChild() incorrectly. The second argument to ImportNode() is set to true because I want the child elements of Animal (3 fields, arbitrarily named "Quantity", "Adjective" and "Name") to end up in the target file.

B) The second problem is getting the Animal element into outdoc. I get `<Animal />` but I need `<Animal></Animal>`, so that I can place node n inside it. I think the problem is in how I do it: outdoc.AppendChild(rootnode);

To display the XML, I call outdoc.Save(Console.Out). I already have code that saves to an output file, and it will work once I can build outdoc correctly.

There is a similar question, "Split XML into multiple XML files", but I still do not understand the solution code there. I think I am very close with this approach and would appreciate any help you can provide.

Later I plan to accomplish the same task with XmlReader, since I will need to process large input files, and I understand that XmlDocument reads the whole document into memory, which may cause memory problems.
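
For the XmlReader variant mentioned above, one possible shape is sketched here. It is only an outline under assumptions: the class name AnimalSplitter, the method Split and its parameters inputPath and outputDir are made up for illustration, and the sketch assumes every Animal element carries an id attribute. The reader streams through the input, and only one Animal element at a time is materialized, via XmlDocument.ReadNode, before being saved.

```csharp
using System;
using System.IO;
using System.Xml;

static class AnimalSplitter
{
    // Hypothetical helper: streams inputPath and writes one
    // Animal_<id>.xml file per <Animal> element into outputDir.
    // Returns the number of files written.
    public static int Split(string inputPath, string outputDir)
    {
        int count = 0;
        using (XmlReader reader = XmlReader.Create(inputPath))
        {
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "Animal")
                {
                    // Assumption: the id attribute is always present.
                    string id = reader.GetAttribute("id");

                    XmlDocument outdoc = new XmlDocument();
                    outdoc.AppendChild(outdoc.CreateXmlDeclaration("1.0", "utf-8", null));

                    // ReadNode builds the element (with its children) as a node
                    // owned by outdoc and advances the reader past the element,
                    // so Read() must not be called again in this branch.
                    XmlElement animal = (XmlElement)outdoc.ReadNode(reader);
                    animal.RemoveAllAttributes(); // drop id from the output
                    outdoc.AppendChild(animal);

                    outdoc.Save(Path.Combine(outputDir, "Animal_" + id + ".xml"));
                    count++;
                }
                else
                {
                    reader.Read();
                }
            }
        }
        return count;
    }
}
```

Because the node is created by outdoc itself, no ImportNode call is needed in this version. Memory use is bounded by the size of the largest single Animal element rather than by the whole input file.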

2 answers

This is a simple method that seems like what you are looking for.

    public void test_xml_split()
    {
        XmlDocument doc = new XmlDocument();
        doc.Load("C:\\animals.xml");
        XmlDocument newXmlDoc = null;
        foreach (XmlNode animalNode in doc.SelectNodes("//Animals/Animal"))
        {
            newXmlDoc = new XmlDocument();
            var targetNode = newXmlDoc.ImportNode(animalNode, true);
            newXmlDoc.AppendChild(targetNode);
            newXmlDoc.Save(Console.Out);
            Console.WriteLine();
        }
    }
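
As a small illustration of why appending the original node fails: ImportNode does not move or change the node it is given. It returns a copy owned by the target document, and that returned copy is what has to be appended. The names ImportNodeDemo and CopyIsOwnedByTarget below are made up for this sketch.

```csharp
using System;
using System.Xml;

static class ImportNodeDemo
{
    // Returns true when the imported copy belongs to the target document
    // while the original node still belongs to the source document.
    public static bool CopyIsOwnedByTarget()
    {
        XmlDocument source = new XmlDocument();
        source.LoadXml("<Animals><Animal id=\"1001\"><Name>Rooster</Name></Animal></Animals>");
        XmlNode original = source.DocumentElement.FirstChild;

        XmlDocument target = new XmlDocument();
        XmlNode copy = target.ImportNode(original, true); // deep copy owned by target
        target.AppendChild(copy);                         // append the copy, not the original

        return copy.OwnerDocument == target && original.OwnerDocument == source;
    }
}
```

Calling target.AppendChild(original) instead would throw, because original is still owned by source.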

This approach works without using the "var targetNode" statement. It creates an XmlNode object, called targetNode, for the "Animal" element in outdoc inside the foreach loop. I think the main problems in my original code were: A) I was using the node list nl incorrectly, and B) I could not "import" node n, I think because it was tied specifically to doc. It had to be created as its own node.

The problem with the previously posted solution was its use of the "var" keyword. My program has to target .NET 2.0, and var requires C# 3.0. I like Rogers' solution because it is concise. For my own version, I wanted each step to be a separate statement.

    static void SplitXMLDocument()
    {
        string strFileName;
        string strSeq;
        XmlDocument doc = new XmlDocument(); // The input file
        doc.Load("D:\\Rick\\Computer\\XML\\AnimalBatch.xml");
        XmlNodeList nl = doc.DocumentElement.SelectNodes("//Animals/Animal");
        foreach (XmlNode n in nl)
        {
            strSeq = n.Attributes["id"].Value; // Animal nodes have an id attribute
            XmlDocument outdoc = new XmlDocument(); // Create the outdoc xml document
            XmlNode targetNode = outdoc.CreateElement("Animal"); // Create a separate node to hold the Animal element
            targetNode = outdoc.ImportNode(n, true); // Bring over that Animal
            targetNode.Attributes.RemoveAll(); // Remove the id attribute in <Animal id="1001">
            outdoc.ImportNode(targetNode, true); // place the node n into outdoc
            outdoc.AppendChild(targetNode); // AppendChild to make it stick
            strFileName = "Animal_" + strSeq + ".xml";
            outdoc.Save(Console.Out);
            Console.WriteLine();
            outdoc.Save("D:\\Rick\\Computer\\XML\\" + strFileName);
            Console.WriteLine();
        }
    }