I don't know much about Ruby (or Rails), but I think the problem is that you are not keeping track of your character encodings.
First, decide which encoding you store in your database. Then convert all text to that encoding before writing it to the database. To do that, you first need to know which encoding the text is in to begin with.
A frequently repeated tip is to decode all input from whatever encoding it uses into Unicode (if your language supports it) as soon as possible after you receive it. Then you know that all the text you process inside your program is Unicode. Conversely, encode the text into whatever output encoding you need as the last step before writing it out.
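As a rough sketch of that "decode early, encode late" idea in Ruby (1.9 or later, where every string carries an encoding): the helper names `decode_input` and `encode_output` are made up for illustration, and ISO-8859-1 as the incoming encoding and UTF-8 as the internal one are assumptions, not something taken from your setup.

```ruby
# Hypothetical helpers for illustration; not part of Ruby or Rails.

# Input boundary: the bytes arrive in a known source encoding.
# Label them with that encoding, then transcode to UTF-8 right away.
def decode_input(raw_bytes, source_encoding)
  raw_bytes.dup.force_encoding(source_encoding).encode(Encoding::UTF_8)
end

# Output boundary: transcode from the internal UTF-8 form to whatever
# the consumer (database, file, HTTP response) expects.
def encode_output(text, target_encoding)
  text.encode(target_encoding)
end

# "café" as raw ISO-8859-1 bytes (0xE9 is "é" in that encoding).
latin1_bytes = [99, 97, 102, 233].pack("C*")

text = decode_input(latin1_bytes, Encoding::ISO_8859_1)
puts text.encoding      # UTF-8
puts text.bytes.inspect # [99, 97, 102, 195, 169]

# Everything between the two boundaries only ever sees UTF-8.
out = encode_output(text, Encoding::ISO_8859_1)
puts out.bytes.inspect  # [99, 97, 102, 233]
```

Note that `force_encoding` only relabels the bytes, while `encode` actually converts them, so the first is used where you are stating what the bytes already are and the second where you want them changed.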
The key is to always know which encoding a given piece of text is in, at every point in your code.
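One way to make that knowledge explicit, assuming you have settled on UTF-8 internally, is a small guard that you call wherever text crosses into your own code. The name `assert_utf8!` is hypothetical; `String#encoding` and `String#valid_encoding?` are standard Ruby methods.

```ruby
# Hypothetical guard for illustration; not part of Ruby or Rails.
# Raises unless the string is labelled UTF-8 and its bytes really are
# well-formed UTF-8. Returns the string unchanged otherwise.
def assert_utf8!(text)
  unless text.encoding == Encoding::UTF_8 && text.valid_encoding?
    raise ArgumentError, "expected valid UTF-8, got #{text.encoding}"
  end
  text
end

assert_utf8!("caf\u00E9")        # passes: labelled UTF-8, valid bytes

begin
  # Labelled UTF-8, but 0xE9 on its own is not valid UTF-8.
  assert_utf8!("caf\xE9")
rescue ArgumentError => e
  puts "rejected: #{e.message}"
end
```

A check like this fails loudly at the boundary instead of letting mislabelled bytes travel all the way to the database.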
Epcylon, Nov 22 '09 at 20:02