I am using the Java SAX classes to parse an XML file. If the XML file declares version 1.0, everything works fine, but if it declares version 1.1, some attribute values come back corrupted. This gives me incorrect results without throwing any exceptions.
My XML file basically looks like this:
    <?xml version="1.1" encoding="UTF-8" ?>
    <gpx>
      <trk>
        <name>Name of the track</name>
        <trkseg>
          <trkpt lat="12.3456789" lon="1.2345678">
            <ele>1234</ele>
            <time>2013-03-26T12:34:56Z</time>
            <speed>0</speed>
          </trkpt>
          ... and then 419 further identical copies of this trkpt
        </trkseg>
      </trk>
    </gpx>
So when I parse this file with SAX, I expect to find 420 trkpt tags, each with lat and lon attributes. In particular, I expect to find 420 "lat" attributes, all with the value "12.3456789".
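For anyone who wants to reproduce this without copying and pasting by hand, here is a minimal sketch that generates a file of the same shape. The class name `GpxSample`, the method `build`, and the output file names are made up for illustration. The whitespace layout also differs from my hand-made file, so the corrupted points (if any) may not be the same ones.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Builds a GPX test document with N identical track points (hypothetical helper). */
class GpxSample {

    // One track point, with the same values as the sample above.
    static final String TRKPT =
        "<trkpt lat=\"12.3456789\" lon=\"1.2345678\">"
        + "<ele>1234</ele><time>2013-03-26T12:34:56Z</time><speed>0</speed>"
        + "</trkpt>\n";

    /** Returns the document text for the given XML version ("1.0" or "1.1"). */
    static String build(String xmlVersion, int points) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"").append(xmlVersion).append("\" encoding=\"UTF-8\" ?>\n");
        sb.append("<gpx><trk><name>Name of the track</name><trkseg>\n");
        for (int i = 0; i < points; i++) {
            sb.append(TRKPT);
        }
        sb.append("</trkseg></trk></gpx>\n");
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        // Write one file per XML version; the file names are arbitrary.
        for (String version : new String[] {"1.0", "1.1"}) {
            Files.write(Paths.get("track-" + version + ".gpx"),
                        build(version, 420).getBytes(StandardCharsets.UTF_8));
        }
    }
}
```

The two files differ only in the version number in the XML declaration, which is the only variable I am changing between the working and the failing case.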
To parse it, I create a handler object and pass it to the parser together with an input stream on the local file:

    SAXParser saxParser = SAXParserFactory.newInstance().newSAXParser();
    inStream = new FileInputStream(file);
    saxParser.parse(inStream, handler);
    System.out.println("done");
The handler class extends org.xml.sax.helpers.DefaultHandler and has only one method, startElement, which responds to the opening of each trkpt tag:
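    public void startElement(String uri, String localName, String qName,
                             Attributes attributes) {
        if (qName.equals("trkpt") && attributes != null
                && attributes.getLength() == 2
                && attributes.getValue(0).charAt(0) != '1') {
            // Report any first attribute whose value does not start with '1'.
            System.out.println(attributes.getQName(0) + " = " + attributes.getValue(0));
        }
    }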
So what is the result? If the XML file says version 1.0, then all I see is "done". All 420 trkpt tags were found, each had two attributes, the first was always called "lat", and its value always started with "1", as expected. Excellent!
If the xml file is modified to indicate version="1.1" in the first line, I get the following output:
    lat = :34.56Z</t
    lat = :56Z</time
    done

So although all 420 of my points should be identical, two of them gave me a completely wrong attribute value. No exceptions were thrown. The parser still found 420 trkpts, and all of them had two attributes called "lat" and "lon". Oddly enough, the lon values are always fine.
I created this XML file in a text editor by copying and pasting the first trkpt, so I am sure that all the values are identical, that no points in the file have funny attribute values, and that there are no non-ASCII characters, entity codes, or anything else odd in the file.
I tried it with Sun JRE6, OpenJDK6 and OpenJDK7 on three different machines with two different operating systems. So either I'm doing something wrong, or this particular XML file is somehow incompatible with XML 1.1, or there is a widespread SAX bug (which seems unlikely, since I would expect that to affect a lot of people). Again, note that with XML 1.0 everything works fine. Also note that there is nothing special about the number 420: if there are only 100 points in the file, they are all parsed correctly, and if there are several thousand, then a certain number of them get a corrupted first attribute value in this way. The length of the attribute value always seems correct, but the characters are taken from the wrong position in the file. Could it be a buffer index overflow?
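To make the dependence on file size easier to check, here is a self-contained sketch that parses generated documents of several sizes under both versions and counts the trkpts whose first attribute is wrong. The names `Xml11Sweep`, `Counter`, `build` and `parse` are made up for illustration, and the sizes are simply the ones mentioned above. The generated layout is not byte-identical to my file, so whether the corruption shows up (and at which size) may depend on the layout and on the parser implementation in use.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class Xml11Sweep {

    static final String LAT = "12.3456789";

    /** Counts trkpt elements, and those whose attributes are not as expected. */
    static class Counter extends DefaultHandler {
        int seen;
        int bad;

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes attributes) {
            if (qName.equals("trkpt")) {
                seen++;
                // A point is "bad" if it lacks two attributes or the first value differs.
                if (attributes.getLength() != 2 || !LAT.equals(attributes.getValue(0))) {
                    bad++;
                }
            }
        }
    }

    /** Builds a document with the given number of identical track points. */
    static String build(String xmlVersion, int points) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"").append(xmlVersion).append("\" encoding=\"UTF-8\" ?>\n");
        sb.append("<gpx><trk><name>Name of the track</name><trkseg>\n");
        for (int i = 0; i < points; i++) {
            sb.append("<trkpt lat=\"").append(LAT).append("\" lon=\"1.2345678\">")
              .append("<ele>1234</ele><time>2013-03-26T12:34:56Z</time><speed>0</speed>")
              .append("</trkpt>\n");
        }
        sb.append("</trkseg></trk></gpx>\n");
        return sb.toString();
    }

    /** Parses an in-memory document; checked exceptions are wrapped for convenience. */
    static Counter parse(String xmlVersion, int points) {
        byte[] bytes = build(xmlVersion, points).getBytes(StandardCharsets.UTF_8);
        Counter counter = new Counter();
        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new ByteArrayInputStream(bytes), counter);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("parse failed", e);
        }
        return counter;
    }

    public static void main(String[] args) throws Exception {
        // Show which parser implementation the factory actually returns.
        System.out.println("parser: "
            + SAXParserFactory.newInstance().newSAXParser().getClass().getName());
        for (String version : new String[] {"1.0", "1.1"}) {
            for (int points : new int[] {100, 420, 5000}) {
                Counter c = parse(version, points);
                System.out.printf("version %s, %d points: seen=%d, bad=%d%n",
                                  version, points, c.seen, c.bad);
            }
        }
    }
}
```

A nonzero "bad" count for version 1.1 alongside zero for version 1.0 on the same input would show the same symptom as above, and the printed parser class name shows which implementation produced it.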
I tried removing all the speed tags, but the problem still occurs if there are enough trkpts. It is also sensitive to extra whitespace: if I add line breaks between the trkpts, the problem occurs at different points or returns different attribute values.