How to modify a huge XML file with StAX?

I have a huge XML file (~2 GB), and I need to add new elements and change existing ones. For example, I have:

<books> <book>....</book> ... <book>....</book> </books> 

And I want to get:

 <books> <book> <index></index> .... </book> ... <book> <index></index> .... </book> </books> 

I used the following code:

    XMLInputFactory inFactory = XMLInputFactory.newInstance();
    XMLEventReader eventReader = inFactory.createXMLEventReader(new FileInputStream(file));
    XMLOutputFactory factory = XMLOutputFactory.newInstance();
    XMLStreamWriter writer = factory.createXMLStreamWriter(new FileWriter(file, true));
    while (eventReader.hasNext()) {
        XMLEvent event = eventReader.nextEvent();
        if (event.getEventType() == XMLEvent.START_ELEMENT) {
            if (event.asStartElement().getName().toString().equalsIgnoreCase("book")) {
                writer.writeStartElement("index");
                writer.writeEndElement();
            }
        }
    }
    writer.close();

But the result was the following:

 <books> <book>....</book> .... <book>....</book> </books><index></index> 

Any ideas?

+7
3 answers

Try this:

    XMLInputFactory inFactory = XMLInputFactory.newInstance();
    XMLEventReader eventReader = inFactory.createXMLEventReader(new FileInputStream("1.xml"));
    XMLOutputFactory factory = XMLOutputFactory.newInstance();
    // Write to a separate output file: not the input file, and not in append mode
    XMLEventWriter writer = factory.createXMLEventWriter(new FileWriter(file));
    XMLEventFactory eventFactory = XMLEventFactory.newInstance();
    while (eventReader.hasNext()) {
        XMLEvent event = eventReader.nextEvent();
        // Copy every event from the input to the output unchanged
        writer.add(event);
        if (event.getEventType() == XMLEvent.START_ELEMENT) {
            // Case-sensitive comparison, since XML names are case sensitive
            if (event.asStartElement().getName().getLocalPart().equals("book")) {
                // Emit <index></index> right after the opening <book> tag
                writer.add(eventFactory.createStartElement("", "", "index"));
                writer.add(eventFactory.createEndElement("", "", "index"));
            }
        }
    }
    writer.close();

Notes

new FileWriter(file, true) appends to the end of the file, which is almost certainly not what you want here.

equalsIgnoreCase("book") is a bad idea because XML names are case sensitive.
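
To illustrate the first note, here is a minimal sketch of what append mode does to an existing file. The class name, the temporary file and its content are made up for the demonstration; only the `new FileWriter(file, true)` call comes from the question.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, not part of the original code.
class AppendModeDemo {

    // Writes a tiny document, reopens the same file with append = true
    // (as the question's code does), and returns the resulting content.
    static String appendToExisting() throws IOException {
        Path file = Files.createTempFile("books", ".xml");
        try {
            Files.write(file, "<books><book/></books>".getBytes(StandardCharsets.UTF_8));
            // append = true: existing content is kept, new text goes after it
            try (FileWriter writer = new FileWriter(file.toFile(), true)) {
                writer.write("<index></index>");
            }
            return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        // prints <books><book/></books><index></index>
        System.out.println(appendToExisting());
    }
}
```

The new element lands after the closing root tag, which is the same shape as the result shown in the question.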

+17

Well, it is pretty clear why it behaves the way it does. What you are actually doing is opening the existing file in append mode and writing elements at the end. That clearly contradicts what you are trying to do.

(As an aside: I am surprised that it works as well as it does, given that the input side is likely to see the elements that the output side appends to the end of the file. Indeed, exceptions like the ones Evgeny Dorofeev's example gives are what I would expect. The problem is that if you try to read and write a text file at the same time, and either the reader or the writer uses any form of buffering, explicit or implicit, the reader is liable to see partial states.)

To fix this, you need to start by reading from one file and writing to a different file. Appending will not work. Then you need to arrange for the elements, attributes, content, etc. that are read from the input file to be copied to the output file. Finally, you need to add the extra elements at the appropriate points.
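
The three steps above could be sketched roughly as follows. This is only an illustration under some assumptions: the class and method names are hypothetical, the output is written as UTF-8, and replacing the original with the rewritten copy at the end is my addition, not something required by the approach.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import javax.xml.stream.XMLEventFactory;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical helper class, not part of the original answer.
class IndexInserter {

    // Streams `source` into a temporary file, adding an empty <index>
    // element after every <book> start tag, then replaces `source`.
    static void addIndexElements(Path source) throws IOException, XMLStreamException {
        // Step 1: read from one file, write to a different one.
        // The temp file lives next to the source so the final move stays on one file system.
        Path target = Files.createTempFile(source.toAbsolutePath().getParent(), "books", ".tmp");
        XMLEventFactory events = XMLEventFactory.newInstance();
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = Files.newOutputStream(target)) {
            XMLEventReader reader = XMLInputFactory.newInstance().createXMLEventReader(in);
            XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out, "UTF-8");
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                // Step 2: copy everything that was read to the output.
                writer.add(event);
                // Step 3: add the extra element at the appropriate point.
                if (event.isStartElement()
                        && "book".equals(event.asStartElement().getName().getLocalPart())) {
                    writer.add(events.createStartElement("", "", "index"));
                    writer.add(events.createEndElement("", "", "index"));
                }
            }
            writer.close(); // flushes buffered output
            reader.close();
        }
        // Both streams are closed here, so the copy can replace the original.
        Files.move(target, source, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

Because only one event is held in memory at a time, the size of the file should not matter much. The sketch does not clean up the temporary file if parsing fails halfway; real code would probably want to.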


And is it possible to open the XML file in a mode like RandomAccessFile, but write to it using StAX methods?

No. That is theoretically impossible. To be able to move around the structure of an XML file in a "random" fashion, you would first need to parse the whole thing and build an index of where all the elements are. Even once you have done that, the XML is still stored as characters in a file, and random access does not allow you to insert or delete characters in the middle of a file.
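
A small sketch of the last point, using a made-up class name and a tiny throwaway document: writing through RandomAccessFile at an offset overwrites the bytes that are already there; it does not shift them to make room.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class, for illustration only.
class RandomAccessOverwriteDemo {

    // Creates a file containing "<a><b/></a>", writes "<index/>" at offset 3
    // (just after "<a>"), and returns the file content afterwards.
    static String writeInMiddle() throws IOException {
        Path file = Files.createTempFile("demo", ".xml");
        try {
            Files.write(file, "<a><b/></a>".getBytes(StandardCharsets.US_ASCII));
            try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "rw")) {
                raf.seek(3);
                // The 8 bytes written here replace the 8 bytes "<b/></a>"
                raf.write("<index/>".getBytes(StandardCharsets.US_ASCII));
            }
            return new String(Files.readAllBytes(file), StandardCharsets.US_ASCII);
        } finally {
            Files.deleteIfExists(file);
        }
    }

    public static void main(String[] args) throws IOException {
        // prints <a><index/>
        System.out.println(writeInMiddle());
    }
}
```

The original `<b/></a>` is gone and the document is no longer well-formed, so a real insertion would still mean rewriting everything after the insertion point.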

Perhaps your best bet would be to combine XSL with a SAX-style parser, for example something like what this IBM article describes: http://ibm.com/developerworks/xml/library/x-tiptrax

+3

Perhaps this example of reading and writing with StAX in the Java EE tutorial helps: http://docs.oracle.com/javaee/5/tutorial/doc/bnbfl.html#bnbgq

You can download the tutorial examples here: https://java.net/projects/javaeetutorial/downloads

For quick access, the example is here: http://read.pudn.com/downloads79/ebook/304101/javaeetutorial5/examples/stax/readnwrite/src/readnwrite/EventProducerConsumer.java_.htm

+1