How to remove accents from a string in Haskell?

I need a function to remove accents from a string. Example input and output:

    regardé -> regarde
    fête -> fete
1 answer

The text-icu library contains many Unicode utilities. We will also need the text library to convert our String to Text. I installed both by adding the following two lines to build-depends in my cabal file:

    build-depends:
        -- other packages...
      , text-icu >= 0.7.0.1 && < 1
      , text

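With the dependencies in place, we can remove accents using the following process:

  • Convert the String input to Text.
  • Normalize the input to NFD, which decomposes each accented character into a base character followed by combining marks (see the documentation for why this is necessary).
  • Filter out the accents.
  • Convert the result back to String.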
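To see why the normalization step matters, here is a minimal sketch that uses only base. The names precomposed, decomposed and dropMarks are hypothetical, and Data.Char.isMark is used as a stand-in for the ICU Diacritic property (the two are not identical: isMark matches every combining mark, not only diacritics).

```haskell
import Data.Char (isMark)

-- Hypothetical names, for illustration only.

-- "fête" with 'ê' stored as the single code point U+00EA.
precomposed :: String
precomposed = "f\234te"

-- "fête" with 'ê' stored as 'e' followed by U+0302 COMBINING CIRCUMFLEX ACCENT.
-- This is the shape NFD normalization produces.
decomposed :: String
decomposed = "fe\770te"

-- Drop combining marks. A base-only stand-in for the ICU Diacritic filter.
dropMarks :: String -> String
dropMarks = filter (not . isMark)
```

Applied to precomposed, dropMarks returns the string unchanged, because the accent is not a separate character there. Applied to decomposed, it returns "fete". Filtering therefore only works once the input has been decomposed.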

With this in mind, the following function does the job:

    import qualified Data.Text as T
    import Data.Text.ICU.Char
    import Data.Text.ICU.Normalize

    -- Remove accents: decompose to NFD, then drop the diacritic characters.
    canonicalForm :: String -> String
    canonicalForm s = T.unpack noAccents
      where
        noAccents      = T.filter (not . property Diacritic) normalizedText
        normalizedText = normalize NFD (T.pack s)

If you do not need to convert from String, you can skip the calls to T.pack and T.unpack.
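A Text-only variant might look like the sketch below. The name stripAccents is hypothetical; the calls are the same ones used in canonicalForm, and it assumes the text-icu version range from the build-depends above.

```haskell
import qualified Data.Text as T
import Data.Text.ICU.Char
import Data.Text.ICU.Normalize

-- Hypothetical name. Same pipeline as canonicalForm, without the
-- String conversions: decompose to NFD, then drop diacritic characters.
stripAccents :: T.Text -> T.Text
stripAccents = T.filter (not . property Diacritic) . normalize NFD
```

For example, stripAccents (T.pack "regardé") should give the Text "regarde", matching the example in the question.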
