I have a Nokia N900 phone, and when sending an SMS, the widget displays the number of characters remaining in the message (and the number of actual short messages needed to send the entire message).
I live in France, where I noticed the following unusual thing when writing messages with non-ASCII characters:
- some non-ASCII characters, for example "é", "è", "à", "ù", each take a single character/byte, just like ASCII characters
- the first occurrence of certain other non-ASCII characters, such as "ç", "ê", "ô", consumes a fixed 90 characters/bytes plus 1 byte for the character itself
- each further "ç", "ê", etc. consumes only 1 extra byte.
So I would like to know how the messages are encoded, because this scheme does not match any of the usual encodings I know (ISO-8859-1, UTF-8, UTF-16, ...).
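To make the observation concrete, here is a rough sketch that reproduces the counter I see. It rests on an assumption I have not confirmed for the N900: that the phone uses the GSM 03.38 7-bit default alphabet (160 characters per message) while every character fits it, and falls back to UCS-2 (70 characters per message) as soon as one character does not. The function name `remaining_chars` is made up for illustration, and the sketch only covers a single, non-concatenated message.

```python
# Basic character set of the GSM 03.38 7-bit default alphabet (1 septet each).
GSM7_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension table characters: sent as an escape plus one septet, so they cost 2.
GSM7_EXT = set("^{}\\[~]|€")

GSM7_LIMIT = 160  # characters in a single 7-bit message
UCS2_LIMIT = 70   # 16-bit code units in a single UCS-2 message


def remaining_chars(text):
    """Return the characters left in a single SMS (assumed behaviour)."""
    if all(c in GSM7_BASIC or c in GSM7_EXT for c in text):
        used = sum(2 if c in GSM7_EXT else 1 for c in text)
        return GSM7_LIMIT - used
    # At least one character is outside the 7-bit alphabet: the whole
    # message is assumed to be sent as UCS-2, counted in 16-bit units.
    used = len(text.encode("utf-16-be")) // 2
    return UCS2_LIMIT - used


if __name__ == "__main__":
    for sample in ["", "é", "ç", "çç"]:
        print(repr(sample), remaining_chars(sample))
```

Under that assumption, "é" leaves 159 characters, the first "ç" leaves 69 (a drop of 90 + 1), and a second "ç" leaves 68, which matches what the widget shows.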
gurney alex