How to determine file encoding?

I am trying to determine a file's encoding on Windows using Go. My research turned up many recommendations for Mozilla's charset detectors (chardet), but they are difficult to compile and I had no luck with them.

I also found libguess, and it seems to be widely used on Linux, but I can't get it to work on Windows.

What is the best way to do this? Is there an actual standard library for use with Windows?

+7
2 answers

You can use the Python package chardet.

+1

You may be interested in Enca, the Extremely Naive Charset Analyser. I think you could try reading the file with each candidate encoding and measure how far each attempt is from a "standard" character frequency distribution for the language. Enca requires some information about the language, but I'm not sure whether it uses this approach. (This is just an idea; it could be terribly wrong.)
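To make the idea concrete, here is a minimal Go sketch using only the standard library. It is not how Enca works, as far as I know. The names `guessEncoding`, `score`, and `commonEnglish` are made up for illustration, the candidate list (UTF-8, UTF-16LE, UTF-16BE, ISO-8859-1) is an assumption, and the "frequency distribution" is reduced to a very crude proxy: the share of decoded characters that are a space or one of the twelve most frequent English letters.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf16"
	"unicode/utf8"
)

// commonEnglish is a crude stand-in for a real frequency table:
// the space plus the twelve most frequent English letters.
const commonEnglish = " etaoinshrdlu"

// score returns the fraction of runes in s that are common English characters.
func score(s string) float64 {
	total, hits := 0, 0
	for _, r := range strings.ToLower(s) {
		total++
		if strings.ContainsRune(commonEnglish, r) {
			hits++
		}
	}
	if total == 0 {
		return 0
	}
	return float64(hits) / float64(total)
}

// decodeUTF16 decodes b as UTF-16 in the given byte order.
// It reports false when b is empty or has an odd length.
func decodeUTF16(b []byte, bigEndian bool) (string, bool) {
	if len(b) == 0 || len(b)%2 != 0 {
		return "", false
	}
	units := make([]uint16, len(b)/2)
	for i := range units {
		first, second := uint16(b[2*i]), uint16(b[2*i+1])
		if bigEndian {
			units[i] = first<<8 | second
		} else {
			units[i] = second<<8 | first
		}
	}
	return string(utf16.Decode(units)), true
}

// decodeLatin1 maps every byte to the code point with the same value.
func decodeLatin1(b []byte) string {
	runes := make([]rune, len(b))
	for i, c := range b {
		runes[i] = rune(c)
	}
	return string(runes)
}

// guessEncoding decodes b with every candidate and returns the name of the
// candidate whose text scores highest. Ties go to the earlier candidate.
func guessEncoding(b []byte) string {
	type attempt struct {
		name string
		text string
		ok   bool
	}
	le, okLE := decodeUTF16(b, false)
	be, okBE := decodeUTF16(b, true)
	attempts := []attempt{
		{"UTF-8", string(b), utf8.Valid(b)},
		{"UTF-16LE", le, okLE},
		{"UTF-16BE", be, okBE},
		{"ISO-8859-1", decodeLatin1(b), true},
	}
	best, bestScore := "unknown", -1.0
	for _, a := range attempts {
		if !a.ok {
			continue // the bytes cannot be this encoding at all
		}
		if s := score(a.text); s > bestScore {
			best, bestScore = a.name, s
		}
	}
	return best
}

func main() {
	sample := []byte{'t', 0, 'h', 0, 'e', 0} // "the" encoded as UTF-16LE
	fmt.Println(guessEncoding(sample))
}
```

This is easy to fool with short inputs, text full of symbols, or languages other than English. A real detector would compare against proper per-language frequency tables and would check for a byte order mark first.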

0
