Convert between Latin-1 encoded Data.ByteString and Data.Text

Since the Latin-1 character set (aka ISO-8859-1) is embedded in Unicode as its lowest 256 code points, I would expect the conversion to be trivial. However, I have not found any Latin-1 conversion functions in Data.Text.Encoding, which only contains conversion functions for the common UTF encodings.

What is the recommended and/or most efficient way to convert Data.ByteString values encoded in Latin-1 to Data.Text values?

1 answer

The answer is at the top of the Data.Text.Encoding documentation page:

For a wider range of encodings, use the text-icu package: http://hackage.haskell.org/package/text-icu

GHCi:

λ> import Data.Text.ICU.Convert
λ> conv <- open "ISO-8859-1" Nothing
λ> Data.Text.IO.putStrLn $ toUnicode conv $ Data.ByteString.pack [198, 216, 197]
ÆØÅ
λ> Data.ByteString.unpack $ fromUnicode conv $ Data.Text.pack "ÆØÅ"
[198,216,197]

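However, as you pointed out, Latin-1 code points coincide exactly with the first 256 Unicode code points, so you can use pack/unpack from Data.ByteString.Char8 to map trivially between Latin-1 bytes and String, and then convert to or from Text with the corresponding pack/unpack from Data.Text.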
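A minimal sketch of that pack/unpack route, without text-icu. The function names latin1ToText and textToLatin1 are made up for illustration and are not part of any library. It goes through an intermediate String, so it is simple rather than fast.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC
import qualified Data.Text as T

-- Decode Latin-1 bytes: each byte becomes the Char with the same code point.
-- (Hypothetical helper name, not a library function.)
latin1ToText :: B.ByteString -> T.Text
latin1ToText = T.pack . BC.unpack

-- Encode Text as Latin-1 bytes.
-- Caveat: Data.ByteString.Char8.pack truncates every Char to 8 bits,
-- so characters above U+00FF are silently mangled rather than rejected.
textToLatin1 :: T.Text -> B.ByteString
textToLatin1 = BC.pack . T.unpack
```

For example, latin1ToText (B.pack [198, 216, 197]) should give the same text as the toUnicode call above. Recent releases of the text package also export decodeLatin1 from Data.Text.Encoding for the decoding direction, so it is worth checking the version you depend on before writing your own helper.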
