Remove multiple nodes from an XML file with SAX and Java

I am new to XML parsing in Java with the SAX parser. I have a really large XML file, and because of its size I was advised to use SAX. I have finished my parsing tasks, and they work as expected. One XML task remains: deleting or updating some nodes at the user's request.

I can find all tags by name, change their data and attributes, and so on. If I can do that with SAX, deletion may be possible as well.

The sample XML describes some functionality under certain cases. The user inputs are case names (case1, case2).

    <ruleset>
        <rule id="1">
            <condition>
                <case1>somefunctionality</case1>
                <allow>true</allow>
            </condition>
        </rule>
        <rule id="2">
            <condition>
                <case2>somefunctionality</case2>
                <allow>false</allow>
            </condition>
        </rule>
    </ruleset>

If the user wants to delete one of these cases (for example, case1), the whole rule tag must be deleted, not just the case1 tag. With case1 removed, the XML becomes:

    <ruleset>
        <rule id="2">
            <condition>
                <case2>somefunctionality</case2>
                <allow>false</allow>
            </condition>
        </rule>
    </ruleset>

My question is: can this be done using SAX? I cannot use DOM or any other parser at the moment. The only other option is even worse: string searching. How can this be done with the SAX parser?

3 answers

Try something like this:

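    import javax.xml.transform.Result;
    import javax.xml.transform.Source;
    import javax.xml.transform.TransformerFactory;
    import javax.xml.transform.sax.SAXSource;
    import javax.xml.transform.stream.StreamResult;
    import org.xml.sax.Attributes;
    import org.xml.sax.InputSource;
    import org.xml.sax.SAXException;
    import org.xml.sax.XMLReader;
    import org.xml.sax.helpers.XMLFilterImpl;
    import org.xml.sax.helpers.XMLReaderFactory;

    // Filter that drops every SAX event belonging to <rule id="1">.
    XMLReader xr = new XMLFilterImpl(XMLReaderFactory.createXMLReader()) {
        private boolean skip;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            // "1".equals(...) is null-safe when a <rule> has no id attribute
            if (qName.equals("rule") && "1".equals(atts.getValue("id"))) {
                skip = true;
            }
            if (!skip) {
                super.startElement(uri, localName, qName, atts);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            if (!skip) {
                super.endElement(uri, localName, qName);
            } else if (qName.equals("rule")) {
                // The skipped rule is closed: resume output so that later
                // content, including </ruleset>, is still written.
                skip = false;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            if (!skip) {
                super.characters(ch, start, length);
            }
        }
    };

    // An identity transform serializes the filtered event stream.
    Source src = new SAXSource(xr, new InputSource("test.xml"));
    Result res = new StreamResult(System.out);
    TransformerFactory.newInstance().newTransformer().transform(src, res);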

Output

 <?xml version="1.0" encoding="UTF-8"?><ruleset> <rule id="2"> <condition> <case2>somefunctionality</case2> <allow>false</allow> </condition> </rule> </ruleset> 

What you need to build is a SAX event buffer.

When you come across a <rule>, you need to save it (or the information necessary to regenerate it), along with all the other events that occur between it and the "case" element that you want to delete.

If the rule you saved turns out to be the one you want to delete, simply discard the saved events and continue.

If the rule you saved is not the one you want to delete, replay the saved SAX events to the output and continue.
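
For illustration, the buffering idea above could look roughly like the sketch below. It is not code from the question or from the other answers: the names `BufferingRuleFilter`, `Event` and `removeCase` are made up for this example. It assumes that `<rule>` elements are never nested and that the case name the user enters is exactly the element's qualified name. Only the events of the rule currently being read are held in memory.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical sketch: buffers each <rule> and drops it when it contains the named case element.
class BufferingRuleFilter extends XMLFilterImpl {
    // A recorded SAX event that can be forwarded to the next handler later.
    private interface Event { void replay() throws SAXException; }

    private final String caseName;
    private final List<Event> buffer = new ArrayList<>();
    private boolean inRule; // true between <rule> and </rule>
    private boolean drop;   // true once the current rule is known to contain the case

    BufferingRuleFilter(XMLReader parent, String caseName) {
        super(parent);
        this.caseName = caseName;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        if (!inRule && "rule".equals(qName)) { // assumption: rules are never nested
            inRule = true;
            drop = false;
        }
        if (!inRule) {
            super.startElement(uri, localName, qName, atts); // outside a rule: pass through
            return;
        }
        if (caseName.equals(qName)) {
            drop = true;    // discard what was buffered for this rule so far
            buffer.clear();
        }
        if (!drop) {
            Attributes copy = new AttributesImpl(atts); // the parser may reuse atts
            buffer.add(() -> super.startElement(uri, localName, qName, copy));
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        if (!inRule) {
            super.endElement(uri, localName, qName);
            return;
        }
        if (!drop) {
            buffer.add(() -> super.endElement(uri, localName, qName));
        }
        if ("rule".equals(qName)) { // rule finished: replay it (buffer is empty if dropped)
            inRule = false;
            for (Event e : buffer) {
                e.replay();
            }
            buffer.clear();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (!inRule) {
            super.characters(ch, start, length);
        } else if (!drop) {
            char[] copy = new String(ch, start, length).toCharArray(); // ch is reused by the parser
            buffer.add(() -> super.characters(copy, 0, copy.length));
        }
    }

    // Convenience helper for the example: filters an in-memory document and returns the result.
    public static String removeCase(String xml, String caseName) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader filter = new BufferingRuleFilter(factory.newSAXParser().getXMLReader(), caseName);
            StringWriter out = new StringWriter();
            TransformerFactory.newInstance().newTransformer().transform(
                    new SAXSource(filter, new InputSource(new StringReader(xml))),
                    new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Calling `BufferingRuleFilter.removeCase(xml, "case1")` on the sample document from the question should leave only the rule with id 2. For a large file you would presumably pass a file-based `InputSource` and a `StreamResult` that writes to a file instead of the in-memory strings used here.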


SAX is most commonly used for reading and parsing XML, but there is an article on how to use SAX to write files. The chapter appears to be available online:

http://xmlwriter.net/sample_chapters/Professional_XML/31100604.shtml

[The article dates from 1999, so it uses an old version of SAX, but the concepts still apply.]

The basic idea is to create a custom DocumentHandler / ContentHandler. Whenever it receives a SAX event, it serializes the event and writes it to a stream, a file, or wherever the output should go. In other words, you use your input document as a source of SAX events and forward those events to an XMLOutputter.

The tough part is getting to the point where you can parse an XML document into a SAX event stream, drive the XMLOutputter, and generate an exact copy of the input file. Once you have that working, you can move on to the editing logic, where you read your rules and use them to modify the output file.
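
As a rough starting point for that copy step, a handler along the following lines writes each SAX event straight back out as text. This is only a sketch and not the code from the linked chapter: `SerializingHandler` and its `copy` helper are hypothetical names, and it handles elements, attributes and character data only. Comments, processing instructions, the XML declaration and the DOCTYPE are not reproduced, so the output is not yet a byte-for-byte copy.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: a ContentHandler that serializes the events it receives to a Writer.
class SerializingHandler extends DefaultHandler {
    private final Writer out;

    SerializingHandler(Writer out) {
        this.out = out;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        StringBuilder tag = new StringBuilder("<").append(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            tag.append(' ').append(atts.getQName(i))
               .append("=\"").append(escape(atts.getValue(i))).append('"');
        }
        write(tag.append('>').toString());
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        write("</" + qName + ">");
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        write(escape(new String(ch, start, length)));
    }

    // Re-escapes the characters that the parser has already decoded.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    // SAX callbacks cannot throw IOException, so wrap it.
    private void write(String s) throws SAXException {
        try {
            out.write(s);
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }

    // Convenience helper for the example: parses a string and returns the re-serialized text.
    public static String copy(String xml) {
        try {
            StringWriter out = new StringWriter();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new SerializingHandler(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The editing logic would then sit between the parser and this handler, for example by not forwarding the events of a rule that should be deleted. For a large file the `Writer` would normally be a buffered file writer rather than a `StringWriter`.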

This is much more work than DOM, JDOM, XSLT, etc., but it can help in your situation, because you never need to hold the entire document in memory.
