
Inconsistent behavior of lines and words

Here is the GHCi session:

  Prelude> words " one two three"
  ["one","two","three"]
  Prelude> lines "\none\ntwo\nthree"
  ["","one","two","three"]

Is there a reason for this inconsistency? And if so, what is it?

1 answer

lines is essentially a bijection: you can use it to split any string at the '\n' characters, and then reassemble it completely with unlines. (Well, almost: never mind trailing newlines and Windows line endings.)
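The round trip can be checked directly. The following is only a minimal sketch: roundTrips is a helper name made up for this illustration, not a library function, and it uses nothing beyond the Prelude.

```haskell
-- roundTrips is a hypothetical helper: does `unlines . lines`
-- give back exactly the string we started with?
roundTrips :: String -> Bool
roundTrips s = unlines (lines s) == s

main :: IO ()
main = do
  -- The leading '\n' is preserved as an empty first line.
  print (lines "\none\ntwo\nthree")
  -- With a final newline the round trip is exact.
  print (roundTrips "\none\ntwo\nthree\n")
  -- Without it, unlines appends a '\n', so the result differs.
  print (roundTrips "\none\ntwo\nthree")
```

The last case is the "almost" mentioned above: the empty first line survives the round trip, but a missing trailing newline does not.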

If words behaved the same way, only with ' ' instead of '\n' as the delimiter, it would not do what we want. For example, the string

  "I will not buy this record\nit is scratched" 

would be split into

  ["I","will","not","buy","this","record\nit","is","scratched"] 

which words avoids by breaking at any whitespace character.

  Prelude> words "I will not buy this record\nit is scratched"
  ["I","will","not","buy","this","record","it","is","scratched"]

This means that a) it is not a bijection, since the kind of whitespace is lost, and b) you would get plenty of “empty words” whenever there are two adjacent whitespace characters.

Therefore, the sensible behavior for words is to simply collapse such runs of whitespace into a single separator.
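To make the comparison concrete, here is a rough sketch of the hypothetical alternative. splitOnSpace is an invented name, loosely modeled on how lines treats '\n'; it is not part of the Prelude, and it is not how words is implemented.

```haskell
-- Hypothetical splitter (not a library function): breaks only at ' '
-- and keeps empty pieces instead of collapsing them.
splitOnSpace :: String -> [String]
splitOnSpace s = case break (== ' ') s of
  (piece, [])       -> [piece]
  (piece, _ : rest) -> piece : splitOnSpace rest

main :: IO ()
main = do
  -- The newline is not a delimiter here, so "record\nit" stays one piece.
  print (splitOnSpace "record\nit is")
  -- A leading space and a double space both produce empty "words".
  print (splitOnSpace " one  two")
  -- The real words drops the empties and splits at any whitespace.
  print (words " one  two")
  -- Reassembling with unwords loses the original kind of whitespace.
  print (unwords (words "record\nit  is"))
```

Under these assumptions the space-only splitter keeps every delimiter position, at the cost of empty entries and of ignoring other whitespace, while words gives clean tokens but cannot be inverted by unwords.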
