How to check opening and closing tags in an XML file using Java?

I have an XML file, for example:

 <file>
   <students>
     <student>
       <name>Arthur</name>
       <height>168</height>
     </student>
     <student>
       <name>John</name>
       <height>176</height>
     </student>
   </students>
 </file>

How can I check that every opening tag has a matching end tag? For example, if I leave out an end tag:

 <file>
   <students>
     <student>
       <name>Arthur</name>
       <height>168</height>
     <!-- Ending tag for student missing here -->
     <student>
       <name>John</name>
       <height>176</height>
     </student>
   </students>
 </file>

How can I continue parsing the rest of the file?

I tried using the SAX parser as described here, but it does not suit my needs, because it throws an exception when a closing tag is missing, as in the second XML example above.

3 answers

An XML file that does not satisfy your condition "for every opening tag, there is an end tag" is not well-formed. Checking that a file is well-formed is the first task of an XML parser. Therefore, you need an XML parser.
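As a rough illustration of letting the parser do the check, here is a minimal sketch that uses the JDK's built-in SAX parser and reports where the first well-formedness error occurs. The class name WellFormedCheck and the method firstError are made up for this example, and the exact wording of the error message depends on the parser implementation.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not taken from the question or the linked tutorial.
class WellFormedCheck {

    /** Returns null if the XML is well-formed, otherwise a description of the first fatal error. */
    static String firstError(String xml) throws Exception {
        try {
            // DefaultHandler ignores all content events; we only care whether parsing succeeds.
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return null;
        } catch (SAXParseException e) {
            // The parser stops at the first well-formedness error and reports its position.
            return "line " + e.getLineNumber() + ", column " + e.getColumnNumber()
                    + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstError("<file><student></student></file>"));
        System.out.println(firstError("<file><student></file>"));
    }
}
```

Note that this only tells you where the first problem is. A conforming parser is not allowed to carry on normally after a fatal error, so it will not parse the rest of the file.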


There is an error in the tutorial you found: characters() can be called multiple times for the same element (source). The correct way to mark the end of an element is to reset the corresponding boolean state inside endElement(). The comments section contains code that shows the necessary changes.

Once this problem is fixed, you can add an error check in startElement() to make sure that the file is not trying to start an element that is invalid in the current state. This also lets you make sure that a name element appears only inside a student element.
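Here is a minimal sketch of that idea. It is not the tutorial's code: instead of separate boolean flags it keeps a stack of the currently open elements, which serves the same purpose. The class name StudentHandler and the helper parseNames are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler; element names follow the XML in the question.
class StudentHandler extends DefaultHandler {
    private final Deque<String> open = new ArrayDeque<>();  // currently open elements, innermost first
    private final StringBuilder text = new StringBuilder(); // text gathered over several characters() calls
    final List<String> names = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        // State check: <name> is only accepted directly inside <student>.
        if (qName.equals("name") && !"student".equals(open.peek())) {
            throw new SAXException("<name> found outside <student>");
        }
        open.push(qName);
        text.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called more than once per element, so append instead of assigning.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("name")) {
            names.add(text.toString().trim());
        }
        // The element is finished here, so this is where the state is reset.
        open.pop();
        text.setLength(0);
    }

    /** Parses the XML string and returns the contents of all <name> elements. */
    static List<String> parseNames(String xml) throws Exception {
        StudentHandler handler = new StudentHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.names;
    }
}
```

The sketch relies on the default SAXParserFactory not being namespace-aware, so qName carries the plain element name.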


You can implement the following algorithm (pseudo-code):

 String xml = ...
 stack = new Stack()
 while True:
     tag = extractNextTag(xml)
     // no new tag is found
     if tag == null:
         break
     if (tag.isOpening()):
         stack.push(tag.name)
     else:
         // a closing tag with nothing open is also an error
         if stack.isEmpty():
             error("Open/close tag error")
         oldTagName = stack.pop()
         if (oldTagName != tag.name):
             error("Open/close tag error")
 if ! stack.isEmpty():
     error("Open/close tag error")

You can implement the extractNextTag function in 10-20 lines of code using some knowledge of parsers, or just write simple regular expressions. Of course, when you look for the next tag, you need to start the search at the character that follows the last tag you found.
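For what it is worth, here is one way the stack idea could look in Java. It is only a sketch: the class name TagBalanceChecker and the regular expression are my own assumptions, and the pattern is deliberately simple, so it does not handle comments, CDATA sections, or a '>' character inside an attribute value. Instead of stopping at the first mismatch it records the error and keeps scanning, which is what the question asks for.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical checker; a simplification, not a real XML parser.
class TagBalanceChecker {
    // Group 1: "/" for a closing tag, group 2: tag name, group 3: "/" for a self-closing tag.
    private static final Pattern TAG =
            Pattern.compile("<(/?)([A-Za-z_:][\\w.:\\-]*)[^<>]*?(/?)>");

    /** Returns one message per unbalanced tag; an empty list means all tags are balanced. */
    static List<String> check(String xml) {
        List<String> errors = new ArrayList<>();
        Deque<String> stack = new ArrayDeque<>();
        Matcher m = TAG.matcher(xml);
        // find() resumes right after the previous match, as the answer describes.
        while (m.find()) {
            String name = m.group(2);
            boolean closing = !m.group(1).isEmpty();
            boolean selfClosing = !m.group(3).isEmpty();
            if (selfClosing) {
                continue; // <tag/> opens and closes itself
            }
            if (!closing) {
                stack.push(name);
            } else if (stack.contains(name)) {
                // Everything opened after 'name' was never closed: report it and recover.
                while (!stack.peek().equals(name)) {
                    errors.add("Missing closing tag for <" + stack.pop() + ">");
                }
                stack.pop();
            } else {
                errors.add("Unexpected closing tag </" + name + ">");
            }
        }
        // Anything still open at the end of the input was never closed.
        while (!stack.isEmpty()) {
            errors.add("Missing closing tag for <" + stack.pop() + ">");
        }
        return errors;
    }

    public static void main(String[] args) {
        String broken = "<file><students><student><name>Arthur</name><height>168</height>"
                + "<student><name>John</name><height>176</height></student></students></file>";
        System.out.println(check(broken)); // prints [Missing closing tag for <student>]
    }
}
```

The checker cannot know which of the two student elements was meant to be closed, so it only reports that one student element is left open when students is closed.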

