I can't figure out how to deal with this problem:
I am developing a web tool for an Italian university, and I have to display words with accented characters (for example è, ù, ...). Sometimes I get these words from a PostgreSQL table (UTF8 encoding), but mostly I have to read long passages from files. These files are UTF-8 encoded XML and display perfectly in Smultron or any other UTF-8 editor (they were created by parsing old files in Python that used entities such as &egrave; instead of "è").
I wrote a Java class that extracts the appropriate segments from an XML file; it is used like this:
String s = parseText(filename, position);
If I write the returned String to a file, everything looks fine. The problem is that if I do
out.write(s);
in the JSP page, I get weird characters. By the way, I also use
String s = getWordFromPostgresql(...);
out.write(s);
in the same JSP, and that displays fine.
Any clues?
Thanks Nicola
@krosenvold
Thank you for your answer. However, that directive is already on the page and it does not help (in fact it "works", but only for the strings I get from the database). I think the problem has to do with reading from files, but I cannot understand it: the strings come out right in plain Java but not in the JSP (I cannot think of a better explanation...).
Here is a basic example extracted from the real code. The method that reads from files returns a Map from Mark (an object representing a position in the text) to String (containing the text).
This is in the .jsp page (with the UTF-8 directive mentioned in the posts above):
// ...
Map<Mark, String> map = TestoMarkParser.parseMarks(...);
out.write(map.get(m));
and this is the result:
"Fu per√≤ cos√¨ in uso il Genere Enharmonico, che quelli quali vi si esercitavano",
If I put the same code in a Java class and replace out.write with System.out.println, the result is:
"Fu però così in uso il Genere Enharmonico, che quelli quali vi si esercitavano",
I did some analysis with a hex editor; here it is:
original string: "fu però così"
ò in the XML file: C3 B2
ò as written by out.write() in the JSP page: E2 88 9A E2 89 A4
ò as written to a file via
FileWriter w = new FileWriter(new File("out.txt"));
w.write(s); // s is the parsed string
w.close();
gives: C3 B2
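As a cross-check of the hex values above, here is a minimal sketch (the class and method names are hypothetical, not part of my real code) that encodes a string as UTF-8 and prints the bytes in hex. It uses only the standard UTF-8 charset. Encoding ò (U+00F2) gives C3 B2, while encoding the two characters that show up in the JSP output, √ (U+221A) and ≤ (U+2264), gives exactly E2 88 9A E2 89 A4. That suggests the String may already contain the wrong characters before out.write is called, although I have not confirmed where they come from.

```java
import java.nio.charset.StandardCharsets;

class Utf8Hex {
    // Encodes the string as UTF-8 and returns the bytes as
    // space-separated uppercase hex pairs.
    static String utf8Hex(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            // Mask to 0..255 so negative bytes print as two hex digits.
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(utf8Hex("\u00F2"));        // the expected character ò
        System.out.println(utf8Hex("\u221A\u2264"));  // the two characters seen in the JSP output
    }
}
```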
Printing the value of each character as an int:
0: 70 = F
1: 117 = u
2: 32 = (space)
3: 112 = p
4: 101 = e
5: 114 = r
6: 8730 = √
7: 8804 = ≤
8: 32 = (space)
9: 99 = c
10: 111 = o
11: 115 = s
12: 8730 = √
13: 168 = ¨
14: 10 = (newline)
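For reference, a dump like the one above can be produced with a small helper along these lines. This is only a sketch: CharDump and dump are hypothetical names, not taken from my real code, and the sample string in main is just the first few characters of the broken output.

```java
class CharDump {
    // Returns one "index: value = char" line per UTF-16 code unit of s.
    static String dump(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            sb.append(i).append(": ")
              .append((int) c).append(" = ")
              .append(c).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "Fu per" followed by the two characters that appear instead of ò.
        System.out.print(dump("Fu per\u221A\u2264"));
    }
}
```

Running the same helper on the String returned by the parser, both in the plain Java class and in the JSP, would show whether the two environments hold the same characters in memory.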