How to read a large XML file without loading it into memory while still using XElement

I want to read a large XML file (100+ MB). Because of its size, I do not want to load the whole file into memory as an XElement. I use LINQ to XML queries for parsing and reading.

What is the best way to do this? Is there an example of combining XPath or XmlReader with LINQ to XML / XElement?

Please help. Thanks.

+6
xml xpath large-files linq-to-xml xelement
3 answers

Yes, you can combine XmlReader with the XNode.ReadFrom method; see the example in the documentation, which uses C# to selectively process the nodes found by the XmlReader as XElement objects.

+7

The sample code in the MSDN documentation for the XNode.ReadFrom method is as follows:

    class Program
    {
        static IEnumerable<XElement> StreamRootChildDoc(string uri)
        {
            using (XmlReader reader = XmlReader.Create(uri))
            {
                reader.MoveToContent();
                // Parse the file and display each of the nodes.
                while (reader.Read())
                {
                    switch (reader.NodeType)
                    {
                        case XmlNodeType.Element:
                            if (reader.Name == "Child")
                            {
                                XElement el = XElement.ReadFrom(reader) as XElement;
                                if (el != null)
                                    yield return el;
                            }
                            break;
                    }
                }
            }
        }

        static void Main(string[] args)
        {
            IEnumerable<string> grandChildData =
                from el in StreamRootChildDoc("Source.xml")
                where (int)el.Attribute("Key") > 1
                select (string)el.Element("GrandChild");

            foreach (string str in grandChildData)
                Console.WriteLine(str);
        }
    }

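However, I found that the StreamRootChildDoc method in this example should be changed as follows. XNode.ReadFrom already leaves the reader positioned on the node after the element it has read, so the unconditional reader.Read() in the original loop can skip the next Child element when no whitespace separates the elements: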

    static IEnumerable<XElement> StreamRootChildDoc(string uri)
    {
        using (XmlReader reader = XmlReader.Create(uri))
        {
            reader.MoveToContent();
            // Parse the file and display each of the nodes.
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "Child")
                {
                    XElement el = XElement.ReadFrom(reader) as XElement;
                    if (el != null)
                        yield return el;
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }
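
As a quick check of the difference, here is a minimal sketch (my own, not taken from the documentation) that runs both loop shapes over a small in-memory document with no whitespace between the Child elements. The class name, the method names and the sample XML are made up for illustration, and the methods take a TextReader instead of a URI so the example needs no file on disk.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

static class StreamingDemo
{
    // Hypothetical sample input: the Child elements are directly adjacent.
    public const string CompactXml =
        "<Root>" +
        "<Child Key=\"1\"><GrandChild>a</GrandChild></Child>" +
        "<Child Key=\"2\"><GrandChild>b</GrandChild></Child>" +
        "<Child Key=\"3\"><GrandChild>c</GrandChild></Child>" +
        "</Root>";

    // Loop shape of the documentation sample: Read() runs on every iteration,
    // even right after ReadFrom has already moved the reader forward.
    public static IEnumerable<XElement> StreamWithUnconditionalRead(TextReader input)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "Child")
                {
                    if (XNode.ReadFrom(reader) is XElement el)
                        yield return el;
                }
            }
        }
    }

    // Corrected loop shape: call Read() only when the current node was not
    // consumed by ReadFrom.
    public static IEnumerable<XElement> StreamWithConditionalRead(TextReader input)
    {
        using (XmlReader reader = XmlReader.Create(input))
        {
            reader.MoveToContent();
            while (!reader.EOF)
            {
                if (reader.NodeType == XmlNodeType.Element && reader.Name == "Child")
                {
                    if (XNode.ReadFrom(reader) is XElement el)
                        yield return el;
                }
                else
                {
                    reader.Read();
                }
            }
        }
    }

    // Joins the Key attributes of the streamed elements for easy comparison.
    public static string Keys(IEnumerable<XElement> elements) =>
        string.Join(",", elements.Select(e => (string)e.Attribute("Key")));

    static void Main()
    {
        // Prints "1,3": the second Child is skipped.
        Console.WriteLine(Keys(StreamWithUnconditionalRead(new StringReader(CompactXml))));
        // Prints "1,2,3".
        Console.WriteLine(Keys(StreamWithConditionalRead(new StringReader(CompactXml))));
    }
}
```

With indented input the first loop often appears to work, because the extra Read() only consumes the whitespace node between two elements. That is probably why the problem is easy to miss.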
+5

Just keep in mind that you will have to read the file sequentially, and referring to siblings or descendants is going to be slow at best and impossible at worst. Otherwise, @MartinHonnn has the key.

0