What is the correct way to handle string encoding in Haskell?

I am on Windows with code page 949. Both Excel and Notepad.exe happily save files encoded in cp949.

In Python, dealing with them is painless, using str.encode and str.decode.

I recently discovered Haskell, and it seems that there are several ways to handle strings. Real World Haskell tells me to use ByteString for efficient I/O, but I don't see a way to convert between the encodings I use.

I need to read files that are not UTF-8 encoded and write them back in their original encoding. Most of them will be cp949.

Internally, my Haskell source will be in UTF-8.

This was not difficult in Python, following the principle of str for I/O and unicode for processing, but Haskell lacks even built-in support for cp949.

So the question is: how do I do I/O on files in different encodings? I have to read, convert, process and write.


Edit

I tried both options and ... the state of text conversion on Windows seems terrible.

text-icu

pros:

  • text seems like a modern, high-level choice for text manipulation.
  • easy to install on Windows: just grab the ICU binaries and specify the include and lib folders when installing text-icu with cabal install.

cons:

  • converters live in IO
  • cannot initialize a converter several times (something to do with thread safety; I get a runtime error)
  • no support for lazy bytestrings
  • more than 20 MB of DLLs

iconv

pros:

  • no monads required

cons:

  • windows
  • , . iconv ( dll) , , haskell, ,

You can use the Data.Text.ICU.Convert module from the text-icu package to convert to and from Text.

Given a ByteString, something like this:

import Data.ByteString (ByteString)
import Data.Text (Text)
import qualified Data.Text.ICU.Convert as Convert

-- Decode cp949-encoded bytes into Text.
decodeCP949 :: ByteString -> IO Text
decodeCP949 bs = do
    conv <- Convert.open "cp949" Nothing
    return $ Convert.toUnicode conv bs

-- Encode Text as cp949 bytes.
encodeCP949 :: Text -> IO ByteString
encodeCP949 t = do
    conv <- Convert.open "cp949" Nothing
    return $ Convert.fromUnicode conv t

Both functions run in IO. If you need pure versions, you could wrap them in unsafePerformIO.
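Since the question reports a runtime error when a converter is initialized more than once, one possible arrangement is to open the converter a single time and pass it around. The following is only a sketch under that assumption: transcodeFile is a hypothetical helper name, and it relies on Convert.toUnicode and Convert.fromUnicode being pure once a Converter exists.

```haskell
import qualified Data.ByteString as B
import qualified Data.Text as T
import qualified Data.Text.ICU.Convert as Convert

-- Hypothetical helper (not part of text-icu): read a file in the
-- converter's encoding, apply a pure Text transformation, and write
-- the result to another file in the same encoding.
transcodeFile
    :: Convert.Converter   -- converter opened once by the caller
    -> (T.Text -> T.Text)  -- pure processing step
    -> FilePath            -- source file
    -> FilePath            -- destination file
    -> IO ()
transcodeFile conv process src dst = do
    bytes <- B.readFile src
    let text = Convert.toUnicode conv bytes
    B.writeFile dst (Convert.fromUnicode conv (process text))
```

A caller would run Convert.open "cp949" Nothing once at startup and then reuse the resulting converter for every file, for example transcodeFile conv T.toUpper "in.txt" "out.txt" (both file names are placeholders).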


There is also Codec.Text.IConv from the iconv package:

http://hackage.haskell.org/package/iconv-0.4.1.2/docs/Codec-Text-IConv.html

Its convert function can convert between CP949 and UTF8.

(Text → UTF8 ByteString → CP949 ByteString)

An example on GitHub:

https://github.com/wookay/da/blob/master/haskell/fun/test_encode.hs
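The Text → UTF8 ByteString → CP949 ByteString pipeline might look roughly like the sketch below. It is not taken from the linked file. It assumes the iconv and text packages are installed and that the underlying iconv library accepts the encoding names "CP949" and "UTF-8". The function names decodeCP949 and encodeCP949 are illustrative.

```haskell
import qualified Codec.Text.IConv as IConv
import qualified Data.ByteString.Lazy as BL
import qualified Data.Text.Lazy as TL
import qualified Data.Text.Lazy.Encoding as TLE

-- CP949 bytes -> UTF-8 bytes -> Text
decodeCP949 :: BL.ByteString -> TL.Text
decodeCP949 = TLE.decodeUtf8 . IConv.convert "CP949" "UTF-8"

-- Text -> UTF-8 bytes -> CP949 bytes
encodeCP949 :: TL.Text -> BL.ByteString
encodeCP949 = IConv.convert "UTF-8" "CP949" . TLE.encodeUtf8
```

Both functions are pure and work on lazy ByteStrings, which is the main difference from the text-icu approach. Note that convert throws on invalid input; Codec.Text.IConv also offers convertFuzzy and convertStrictly for more control over error handling.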
