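When I view the HTML from a browser, the question marks are encoded as 0xEF 0xBF 0xBD, which is the UTF-8 encoding of U+FFFD, the REPLACEMENT CHARACTER (not the byte order mark, U+FEFF, which encodes as 0xEF 0xBB 0xBF). A decoder substitutes U+FFFD for bytes it cannot interpret, so the original characters were already lost before the page was sent. Thus, for some reason, the HTML is not being delivered as sensible UTF-8 (although it is indeed valid UTF-8).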
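A quick way to check this yourself is to look at the bytes directly. The following is only a minimal sketch in Python; the helper names `utf8_hex` and `count_replacements` and the sample string `café` are made up for illustration and have nothing to do with the actual page. It shows the two byte sequences side by side and one common way U+FFFD can end up in otherwise valid UTF-8: decoding Latin-1 bytes as UTF-8 with replacement enabled.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER
BOM = "\ufeff"          # U+FEFF BYTE ORDER MARK


def utf8_hex(text: str) -> str:
    """Return the UTF-8 encoding of text as space-separated hex bytes."""
    return " ".join(f"0x{b:02X}" for b in text.encode("utf-8"))


def count_replacements(raw: bytes) -> int:
    """Count U+FFFD characters in a UTF-8 encoded byte string."""
    return raw.decode("utf-8").count(REPLACEMENT)


print(utf8_hex(REPLACEMENT))  # 0xEF 0xBF 0xBD
print(utf8_hex(BOM))          # 0xEF 0xBB 0xBF

# Hypothetical failure mode: Latin-1 bytes are decoded as UTF-8.
# The lone 0xE9 byte is not valid UTF-8, so it becomes U+FFFD.
latin1_bytes = "café".encode("latin-1")
damaged = latin1_bytes.decode("utf-8", errors="replace")

# Re-encoding yields valid UTF-8, but the original "é" is gone for good.
served = damaged.encode("utf-8")
print(utf8_hex(damaged))
print(count_replacements(served))
```

If `count_replacements` returns a nonzero value for the raw bytes of your page, the damage happened somewhere upstream (database, file read, or template layer) and not in the browser.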
Jonathan Leffler