I use awk (Mac OS X) to print only lines that are n characters or longer.
I tried it on a text file (strings.txt) that looks like this:
four
foo
bar
föö
bår
fo
ba
fö
bå
Then I ran this awk script (with n = 3):
awk ' { if( length($0) >= 3 ) print $0 } ' <strings.txt
Output:
four
foo
bar
föö
bår
fö
bå
(The last two lines should not be printed.) Each non-ASCII character (å, ä, ö ...) seems to be counted as two characters, so "fö" and "bå" are reported as having length 3.
(The input file is saved in UTF-8 format.)
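A minimal sketch of a check for this, assuming the awk in use measures length in bytes rather than characters. Forcing the C locale makes awk count bytes, which reproduces the lengths seen above. The helper name `awk_byte_length` is hypothetical, and the octal escapes spell out the UTF-8 encoding of ö so the snippet does not depend on how the script file itself is saved.

```shell
# Hypothetical helper: print the length of each input line as awk sees it
# when forced into the C locale, where length() counts bytes.
awk_byte_length() {
  LC_ALL=C awk '{ print length($0) }'
}

# "fo" is two ASCII bytes.
printf 'fo\n' | awk_byte_length          # prints 2

# ö is U+00F6, which UTF-8 encodes as the two bytes 0xC3 0xB6
# (octal 303 266), so "fö" is three bytes long.
printf 'f\303\266\n' | awk_byte_length   # prints 3
```

If plain `awk '{ print length($0) }'` also prints 3 for the second line without the `LC_ALL=C` override, that would suggest this awk is counting bytes instead of UTF-8 characters.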