The HTTP/1.1 specification (RFC 2616) refers to ISO-8859-1, which is a single-byte character set based on the Latin alphabet.
Since HTTP traffic is supposed to be single-byte, I use the latin1 character set for my similar logs as well. My main reason was simply to keep my indexes smaller.
If you use UTF8 with VARCHAR, only the characters that are actually multibyte take extra bytes, so the table data itself is not much larger. Indexes, however, are kept at a fixed width and padded with spaces in case the extra bytes are needed, which makes UTF8 indexes three times the size of latin1 indexes.
It doesn't bother me if the occasional odd title ends up unreadable. If you aren't indexing the column, though, you might as well use UTF8.
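To make the size argument concrete, here is a rough sketch in Python (not MySQL itself) that compares how many bytes the same text takes in latin1 and UTF-8, and what a worst-case fixed-width key would reserve. The sample strings, the helper names, and the VARCHAR(255) column are made up for illustration; the factor of 3 assumes MySQL's legacy utf8 character set, which allows up to three bytes per character.

```python
# Illustrative only: byte cost of the same text in latin1 vs UTF-8.
# The sample strings below are hypothetical log values.
samples = ["GET /index.html", "caf\u00e9", "na\u00efve r\u00e9sum\u00e9"]


def byte_lengths(text):
    """Return (latin1_bytes, utf8_bytes) for a string."""
    return len(text.encode("latin-1")), len(text.encode("utf-8"))


def worst_case_key_bytes(max_chars, max_bytes_per_char):
    """Bytes reserved if every character is assumed to take the maximum width."""
    return max_chars * max_bytes_per_char


for s in samples:
    latin1_len, utf8_len = byte_lengths(s)
    # Only the accented characters cost an extra byte in UTF-8.
    print(f"{s!r}: latin1={latin1_len} bytes, utf8={utf8_len} bytes")

# Hypothetical VARCHAR(255) column, assuming a fixed-width key.
latin1_key = worst_case_key_bytes(255, 1)  # 1 byte per character
utf8_key = worst_case_key_bytes(255, 3)    # up to 3 bytes per character (assumed legacy utf8)
print(f"worst-case key width: latin1={latin1_key}, utf8={utf8_key}")
```

The stored text grows only where multibyte characters actually occur, while a key sized for the worst case grows by the full factor regardless of content.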
Marcus Adams