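UTF-8 is a variable-length encoding. ASCII characters take a single byte, while everything else (accented letters, for example) takes several. The leading bits of each byte show how long the sequence is: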
- One byte: the MSB is 0 (that is, 7-bit ASCII).
- Two bytes: 110xxxxx 10xxxxxx
- Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
- Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
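The bit patterns in the list can be checked with a few masks. The following is only a minimal sketch: the class and method names (`Utf8LeadByte`, `sequenceLength`, `isContinuation`) are made up for illustration, and it looks at the bit patterns alone, so it does not reject overlong forms or lead bytes that a strict decoder would refuse.

```java
final class Utf8LeadByte {
    /** Length announced by a lead byte, or -1 if the byte cannot start a sequence. */
    static int sequenceLength(int b) {
        b &= 0xFF;                          // treat the value as an unsigned byte
        if ((b & 0x80) == 0x00) return 1;   // 0xxxxxxx: 7-bit ASCII
        if ((b & 0xE0) == 0xC0) return 2;   // 110xxxxx
        if ((b & 0xF0) == 0xE0) return 3;   // 1110xxxx
        if ((b & 0xF8) == 0xF0) return 4;   // 11110xxx
        return -1;                          // 10xxxxxx (continuation) or 11111xxx
    }

    /** True for bytes of the form 10xxxxxx. */
    static boolean isContinuation(int b) {
        return (b & 0xC0) == 0x80;
    }

    public static void main(String[] args) {
        System.out.println(sequenceLength(0x41)); // 'A'
        System.out.println(sequenceLength(0xC3)); // first byte of "é" in UTF-8
        System.out.println(sequenceLength(0xE9)); // "é" in Cp1252: looks like a 3-byte lead
        System.out.println(isContinuation(0xA9)); // second byte of "é" in UTF-8
    }
}
```

Note that 0xE9, the Cp1252 byte for "é", matches 1110xxxx, so a UTF-8 decoder will expect two continuation bytes after it.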
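So if a document claims to be UTF-8 (in the XML declaration) but is not actually UTF-8 (it was saved in some non-UTF-8 encoding, for example Cp1252), the trouble starts at the first non-ASCII character. The XML parser expects UTF-8, meets a non-ASCII byte, and fails (or produces garbage). It expects something like 110xxxxx 10xxxxxx (and gets, say, 01xxxxxx 11xxxxxx 00xxxxxx, which does not match any of the patterns above).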
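I think that is what is happening here. If the XML was saved as Windows-1252, ANSI, or something similar, its non-ASCII characters (codes > 127) will not decode as UTF-8.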
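One way to confirm this kind of mismatch is to run the raw bytes through a strict UTF-8 decoder and see whether it complains. This is a sketch under assumptions: the class name `StrictUtf8Check` and the sample string are illustrative, and ISO-8859-1 stands in for Cp1252 because it is guaranteed to be present in every JVM and encodes "é" as the same byte, 0xE9.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class StrictUtf8Check {
    /** Returns true only if the whole array decodes as UTF-8 without errors. */
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)       // fail instead of substituting U+FFFD
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "café"
        // Real UTF-8: "é" becomes C3 A9, which matches 110xxxxx 10xxxxxx.
        System.out.println(isValidUtf8(text.getBytes(StandardCharsets.UTF_8)));      // true
        // Single-byte encoding: "é" becomes E9 with no continuation bytes after it.
        System.out.println(isValidUtf8(text.getBytes(StandardCharsets.ISO_8859_1))); // false
    }
}
```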
:
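If you cannot fix the file itself and mostly care about the ASCII part, you can tell the XML parser to read the input in an ASCII-compatible 8-bit encoding (ANSI, Windows-XXXX, Mac-Roman, etc.). For example: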
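// Assumes url is a java.net.URL, which has openStream() rather than open().
XmlPullParser parser = Xml.newPullParser();
parser.setInput(url.openStream(), "ISO-8859-1"); // decode as a single-byte encoding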
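The snippet above relies on Android's `Xml.newPullParser()`. For trying the same idea on a desktop JVM, here is a rough equivalent that uses the standard JAXP DOM parser as a stand-in, not the pull parser itself. The class name `ForcedEncodingParse` and the sample document are made up for illustration. It relies on the fact that a parser handed a character stream (a `Reader`) reads the characters as given and disregards the encoding named in the XML declaration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class ForcedEncodingParse {
    /** Parses the bytes as ISO-8859-1 regardless of the declaration and returns the root element's text. */
    static String rootText(byte[] xmlBytes) {
        // Decode the bytes ourselves so the declared encoding is never consulted.
        Reader reader = new InputStreamReader(
                new ByteArrayInputStream(xmlBytes), StandardCharsets.ISO_8859_1);
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(reader));
            return doc.getDocumentElement().getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Declared as UTF-8, but the bytes are actually single-byte encoded.
        byte[] mislabeled = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><name>caf\u00e9</name>"
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(rootText(mislabeled).equals("caf\u00e9"));
    }
}
```

Forcing ISO-8859-1 never fails, because every byte maps to some character, but bytes in the 0x80 to 0x9F range will differ from what Windows-1252 intended, so the right single-byte charset should be used when it is known.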