I am using the JExcel library to read Excel spreadsheets. Each cell in the spreadsheet can contain localization strings in any of 44 languages (English, Portuguese, French, Chinese, etc.). Today I am not telling the API anything about the encoding it should use. It handles Chinese fine, but it always mangles Portuguese and German. Somehow, the default encoding (MacRoman on my dev box, UTF-8 in production) cannot correctly interpret the strings that it pulls from the Excel workbook. There must be something wrong with the way JExcel interprets the character encoding of the file.
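To illustrate the symptom outside of JExcel, here is a minimal sketch using only the JDK. It assumes (this is a guess, not something confirmed from the workbook or the JExcel source) that the accented text is stored as single-byte Latin-1 and then decoded with whatever charset happens to be in effect. The class name `EncodingMismatchDemo` and the sample string are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingMismatchDemo {
    // Decode raw bytes with an explicitly chosen charset.
    static String decode(byte[] raw, Charset cs) {
        return new String(raw, cs);
    }

    public static void main(String[] args) {
        // "Ação über", written with Unicode escapes so the source file's
        // own encoding cannot affect the result.
        String original = "A\u00e7\u00e3o \u00fcber";

        // Assumption: the bytes on disk are single-byte Latin-1.
        byte[] raw = original.getBytes(StandardCharsets.ISO_8859_1);

        // Decoding with the charset the bytes were written in round-trips.
        System.out.println(decode(raw, StandardCharsets.ISO_8859_1));

        // Decoding the same bytes as UTF-8 replaces each accented
        // character with U+FFFD, the replacement character.
        System.out.println(decode(raw, StandardCharsets.UTF_8));
    }
}
```

If this is what is happening, the Portuguese and German strings would break under any default that differs from the stored single-byte encoding, which would match what I see on both machines.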
That said ...
Are all strings in an Excel workbook encoded with the same character set?
Is there any workbook metadata that I can query to find out what that character set is (I have not found it yet)?
If I run all the cells through something like jchardet (http://jchardet.sourceforge.net/), might it be able to guess the character encoding for the whole workbook (this largely depends on the answer to the first question being "yes, all the strings in the workbook are encoded with the same character set")?
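As a rough picture of what such guessing would involve, here is a sketch that does not use jchardet at all. It is a much cruder stand-in built on the JDK's strict `CharsetDecoder`, and it only answers "do these bytes decode without errors under charset X?". It assumes access to the raw bytes of a string, which I do not currently have, since JExcel hands back already-decoded `String` values. `CharsetProbe` and `decodesCleanly` are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class CharsetProbe {
    // Returns true if the bytes decode under the given charset without any
    // malformed or unmappable sequences. This is a validity check only,
    // not real charset detection.
    static boolean decodesCleanly(byte[] raw, Charset cs) {
        try {
            cs.newDecoder()
              .onMalformedInput(CodingErrorAction.REPORT)
              .onUnmappableCharacter(CodingErrorAction.REPORT)
              .decode(ByteBuffer.wrap(raw));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "über" as Latin-1 bytes: the lone 0xFC byte is not valid UTF-8.
        byte[] latin1 = "\u00fcber".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(decodesCleanly(latin1, StandardCharsets.UTF_8));

        // The same text as UTF-8 bytes passes the UTF-8 check.
        byte[] utf8 = "\u00fcber".getBytes(StandardCharsets.UTF_8);
        System.out.println(decodesCleanly(utf8, StandardCharsets.UTF_8));
    }
}
```

A check like this can rule UTF-8 out, but it cannot tell single-byte charsets apart: ISO-8859-1 accepts every byte sequence, so distinguishing it from MacRoman or a Windows code page would need a statistical detector such as jchardet.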
So many questions, so little time.