Reading a UTF-8 file in Haskell as an IO String

I have the following code that works fine if the file does not have utf-8 characters:

    module Main where

    import Ref

    main = do
        text <- getLine
        theInput <- readFile text
        writeFile ("a" ++ text) (unlist . proc . lines $ theInput)

With utf-8 characters, I get the following: hGetContents: invalid argument (invalid byte sequence)

Since the file I'm working with has UTF-8 characters, I would like to handle this exception in a way that, if possible, lets me keep using the functions imported from Ref.

Is there a way to read the UTF-8 file as an IO String so that I can reuse my Ref functions? What changes should I make to my code? Thanks in advance.

I am attaching function declarations from my Ref module:

    unlist :: [String] -> String
    proc :: [String] -> [String]

From the Prelude:

 lines :: String -> [String] 
3 answers

Thanks for the answers, but I found the solution myself. In fact, the file I worked with has the following encoding:

 ISO-8859 text, with CR line terminators 

So, for my Haskell code to read this file as UTF-8, the file first has to be converted to this encoding:

 UTF-8 Unicode text, with CR line terminators 

You can check the file encoding using the file utility as follows:

 $ file filename 

To change the encoding of a file, follow the instructions from this
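
The same re-encoding can also be sketched in Haskell itself with System.IO, by reading the file through one `TextEncoding` and writing it back through another. This is only a minimal sketch: the helpers `readFileWith`, `writeFileWith` and `latin1ToUtf8` and the demo file names are hypothetical, and it assumes the "ISO-8859" file is really ISO-8859-1 (`latin1`), which the output of `file` alone does not guarantee.

```haskell
import System.IO

-- Hypothetical helper (not from the original post): read a whole file
-- using the given text encoding.
readFileWith :: TextEncoding -> FilePath -> IO String
readFileWith enc path = do
    h <- openFile path ReadMode
    hSetEncoding h enc
    contents <- hGetContents h
    -- Force the lazily read contents before closing the handle.
    length contents `seq` hClose h
    return contents

-- Hypothetical helper: write a string using the given text encoding.
writeFileWith :: TextEncoding -> FilePath -> String -> IO ()
writeFileWith enc path str = do
    h <- openFile path WriteMode
    hSetEncoding h enc
    hPutStr h str
    hClose h

-- Re-encode a file from Latin-1 (assumed to be ISO-8859-1) to UTF-8.
latin1ToUtf8 :: FilePath -> FilePath -> IO ()
latin1ToUtf8 src dst = readFileWith latin1 src >>= writeFileWith utf8 dst

main :: IO ()
main = do
    -- Demo file names are placeholders; '\233' is e with an acute accent.
    writeFileWith latin1 "demo-latin1.txt" "caf\233\n"
    latin1ToUtf8 "demo-latin1.txt" "demo-utf8.txt"
    readFileWith utf8 "demo-utf8.txt" >>= print
```

With a helper like `readFileWith latin1`, the original file could also be read directly without converting it first, since the result is an ordinary `String` that `lines`, `proc` and `unlist` accept.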


This can be done with nothing more than System.IO from GHC's base package (which extends the version of the module defined in the Haskell standard), although you will have to use a few more functions:

    module Main where

    import Ref
    import System.IO

    main = do
        text <- getLine
        inputHandle <- openFile text ReadMode
        hSetEncoding inputHandle utf8
        theInput <- hGetContents inputHandle
        outputHandle <- openFile ("a" ++ text) WriteMode
        hSetEncoding outputHandle utf8
        hPutStr outputHandle (unlist . proc . lines $ theInput)
        hClose outputHandle -- I guess this one is optional in this case.
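
A possible variation on the per-handle approach is to change GHC's default encoding once with `setLocaleEncoding` from `GHC.IO.Encoding`, so that plain `readFile` and `writeFile` pick up UTF-8 for files opened afterwards. The sketch below is an assumption-laden illustration rather than the code from the question: `unlist` and `proc` are placeholder definitions standing in for the real functions in Ref, and `process` is a hypothetical wrapper around the original pipeline.

```haskell
import GHC.IO.Encoding (setLocaleEncoding, utf8)

-- Placeholders standing in for the functions exported by Ref
-- (assumptions; the real definitions are not shown in the question).
unlist :: [String] -> String
unlist = unlines

proc :: [String] -> [String]
proc = id

-- Hypothetical wrapper: the question's pipeline for a given file name.
process :: FilePath -> IO ()
process name = do
    theInput <- readFile name
    writeFile ("a" ++ name) (unlist . proc . lines $ theInput)

main :: IO ()
main = do
    -- Handles opened after this call default to UTF-8,
    -- regardless of the system locale.
    setLocaleEncoding utf8
    name <- getLine
    process name
```

The trade-off is that this changes a process-wide default, whereas `hSetEncoding` only affects the one handle it is applied to.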

Use System.IO.Encoding (from the encoding package).

The lack of Unicode support is a well-known issue with the Haskell IO standard library.

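    {-# LANGUAGE ImplicitParams #-}
    module Main where

    import Prelude hiding (readFile, getLine, writeFile)
    import System.IO.Encoding
    import Data.Encoding.UTF8
    import Ref  -- provides unlist and proc

    main = do
        let ?enc = UTF8  -- implicit parameter; needs the ImplicitParams extension
        text <- getLine
        theInput <- readFile text
        writeFile ("a" ++ text) (unlist . proc . lines $ theInput)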