I have a contact who is having problems with SAX when parsing RSS and Atom files. According to him, the text returned for an element appears to be truncated at an apostrophe or, sometimes, at an accented character. There also seems to be a problem with the encoding.
I tried SAX myself and also get truncated text, but I have not yet been able to dig into it. I would appreciate any suggestions from anyone who has already solved this.
This is the code used by ContentHandler:
public void characters(char[] ch, int start, int end) throws SAXException {
    // link = new String(ch, start, end);
}
Edit: The encoding problem may be related to storing information in a byte array, since I know that Java works in Unicode.
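For reference, here is a minimal sketch of one thing that might explain the truncation. The SAX contract allows a parser to deliver the text of a single element in several `characters()` calls (parsers often split at entities such as `&amp;` or at internal buffer boundaries), and the third argument is a length, not an end index. The sketch accumulates the chunks in a `StringBuilder` and reads the result in `endElement()`. It also passes the raw bytes to the parser as an `InputStream`, so the parser can apply the encoding declared in the XML prolog. The names `TitleHandler` and `parseTitle` are hypothetical, and collecting only `title` is an assumption made for illustration.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text of <title> elements.
class TitleHandler extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    private String title;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        // Start with an empty buffer for each element.
        buffer.setLength(0);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per element, so append instead of assigning.
        buffer.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Only here is the element's text known to be complete.
        if ("title".equals(qName)) {
            title = buffer.toString();
        }
    }

    String getTitle() {
        return title;
    }

    // Hypothetical helper: hands the raw bytes to the parser so that it can
    // honor the encoding declared in the XML prolog.
    static String parseTitle(byte[] xml) throws Exception {
        TitleHandler handler = new TitleHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new ByteArrayInputStream(xml), handler);
        return handler.getTitle();
    }
}
```

If the feed arrives as a byte array, passing it through unchanged like this avoids an intermediate byte-to-`String` conversion that might use the wrong charset.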
java parsing atom-feed rss sax
James P.