Some UTF-8 characters are not displayed in the browser

Some UTF-8 characters, such as the one encoded as C2 96 (a hyphen), are not displayed correctly. In the browser it appears as a box containing 00 96 rather than as a “-” (hyphen). What causes this behavior, and how do we fix it?

http://stuffofinterest.com/misc/utf8.php?s=128 (see this URL for codes)

I found that this can be handled with HTML entities. Is there a way to display the character without converting it to an HTML entity?

+4
3 answers

I suspect this is because the characters from U+0080 to U+009F inclusive are control characters. I am still a little surprised that they display differently when encoded directly in the HTML than when written as entities, but basically you should not be using them in the first place. U+0096 is not really a hyphen; it is the "start of guarded area" control character.
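
To illustrate the point about control characters, here is a minimal sketch that asks Python's standard `unicodedata` module how it classifies a code point. The helper names `is_c1_control` and `describe` are made up for this example and are not from the original answer.

```python
import unicodedata


def is_c1_control(ch: str) -> bool:
    """Return True if ch lies in the C1 control range U+0080..U+009F."""
    return 0x80 <= ord(ch) <= 0x9F


def describe(ch: str) -> str:
    """Summarize a character: code point, general category, UTF-8 bytes."""
    utf8_hex = " ".join(f"{b:02X}" for b in ch.encode("utf-8"))
    category = unicodedata.category(ch)  # 'Cc' means "control character"
    return f"U+{ord(ch):04X} category={category} utf8={utf8_hex}"


if __name__ == "__main__":
    # U+0096 is a control character; U+2013 (en dash) and U+002D
    # (hyphen-minus) are dash punctuation.
    for ch in ("\u0096", "\u2013", "-"):
        print(describe(ch), "C1 control" if is_c1_control(ch) else "")
```

Any character reported with category `Cc` has no visible glyph, which would be consistent with the browser drawing a placeholder box.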

For more information, see the U+0080 to U+00FF code chart. In general, try to avoid control characters.

+5

The character you are talking about is an en dash, not a hyphen. Its Unicode code point is U+2013, and its UTF-8 encoding is E2 80 93, not C2 96. The table you linked to is incorrect. The first two columns have nothing to do with UCS-2 or Unicode; they actually contain the windows-1252 encodings of the corresponding characters. The columns labeled "UTF-8 Hex" and "UTF-8 Native" are simply wrong, at least for the rows numbered 128 to 159. The entities in that table do render as an en dash (–), but the UTF-8 byte sequence C2 96 represents a non-displayable control character.
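
The byte values above can be checked with Python's built-in codecs. The sketch below also includes a hypothetical helper, `fix_c1`, which assumes that any C1 control character in a string started life as a windows-1252 byte and maps it back to the intended character. That assumption is mine, not the answer's, and it will not hold for text that uses C1 controls on purpose.

```python
EN_DASH = "\u2013"

# The en dash in UTF-8 is three bytes, E2 80 93.
assert EN_DASH.encode("utf-8") == b"\xe2\x80\x93"

# In windows-1252 the single byte 0x96 is the en dash.
assert b"\x96".decode("windows-1252") == EN_DASH

# The UTF-8 sequence C2 96 decodes to the control character U+0096.
assert b"\xc2\x96".decode("utf-8") == "\u0096"


def fix_c1(text: str) -> str:
    """Reinterpret C1 controls (U+0080..U+009F) as windows-1252 bytes.

    Hypothetical repair helper: code points with no windows-1252
    mapping (for example 0x81) are left unchanged.
    """
    out = []
    for ch in text:
        if 0x80 <= ord(ch) <= 0x9F:
            try:
                ch = bytes([ord(ch)]).decode("windows-1252")
            except UnicodeDecodeError:
                pass  # undefined in windows-1252; keep the original
        out.append(ch)
    return "".join(out)
```

For example, `fix_c1("10\u009620")` yields the string `10–20` with a real en dash, which then encodes to E2 80 93 in UTF-8.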

In any case, you should not need to encode these characters by hand. Just tell your text editor (or whatever you use to create the content) to save the file as UTF-8.

+5

Two reasons come to mind:

  • Are you sure the correct character code is reaching your browser? It is best to check with some kind of hex viewer.
  • The font you are using does not have a glyph defined for this code point.
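
As a rough way to do the first check without a dedicated hex viewer, the sketch below prints the raw bytes of some content and reports any C1 control characters found after decoding it as UTF-8. The function names `hex_dump` and `find_c1_controls` are illustrative only, and the sketch assumes the content is valid UTF-8.

```python
def hex_dump(data: bytes) -> str:
    """Render bytes as space-separated uppercase hex pairs."""
    return " ".join(f"{b:02X}" for b in data)


def find_c1_controls(data: bytes) -> list:
    """Decode data as UTF-8 and list (index, code point) for C1 controls.

    The index is the character position in the decoded text,
    not the byte offset.
    """
    text = data.decode("utf-8")
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if 0x80 <= ord(ch) <= 0x9F
    ]


if __name__ == "__main__":
    good = "a\u2013b".encode("utf-8")  # contains a real en dash
    bad = b"a\xc2\x96b"                # contains the control U+0096
    print(hex_dump(good), find_c1_controls(good))
    print(hex_dump(bad), find_c1_controls(bad))
```

If the dump shows C2 96 where a dash was expected, the problem is in the data rather than in the font.
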
+1
