Print POSIX character class

For a class like

[:digit:]

I would like the result to be

0123456789

Note that the method should work for all POSIX character classes. Here is what I tried:

$ printf %s '[:digit:]'
[:digit:]

§ Character classes

4 answers
$ seq 126 | awk '{printf "%c", $0}' | grep -o '[[:digit:]]'
0
1
2
3
4
5
6
7
8
9
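
The pipeline above prints one character per line, while the question asks for a single line. A minimal sketch of one way to get that, assuming the same seq/awk/grep tools are available; class_members is a hypothetical helper name, and the class is passed without brackets:

```shell
# class_members is a hypothetical helper name, not part of any tool.
# Usage: class_members digit
class_members() {
    # Emit ASCII 1..126, keep the characters in the class, join the lines.
    seq 126 | awk '{printf "%c", $0}' | grep -o "[[:$1:]]" | tr -d '\n'
    echo    # finish with a single trailing newline
}

class_members digit    # prints 0123456789
```

Because the class name is interpolated into the bracket expression, the same function should work for the other classes such as upper or xdigit.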

I am sure there is a better way, but here is a brute-force method:

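for i in {1..127}; do
    # Turn the code into its character via an octal escape (NUL is skipped).
    char=$(printf "\\$(printf '%03o' "$i")")
    # Print the character if it belongs to the class.
    [[ $char =~ [[:alpha:]] ]] && echo "$char"
done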

Loop over the decimal character codes, convert each one to the corresponding ASCII character, and test it against the character class.

The range may be wrong, but the check seems to work.

Inside [[ ]], == (pattern matching) can also be used instead of =~.
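
As a sketch of how the loop could be made reusable for any class, the function below takes the class name as an argument and collects the matches on one line. It assumes bash, and it uses == pattern matching inside [[ ]] rather than =~; print_class is a hypothetical name.

```shell
# print_class is a hypothetical helper; usage: print_class digit
print_class() {
    local class=$1 pat i char out=
    pat="[[:$class:]]"
    for ((i = 1; i < 128; i++)); do
        # printf -v stores the character directly, so newline is not lost
        # the way it would be in a command substitution.
        printf -v char "\\$(printf '%03o' "$i")"
        # An unquoted right-hand side of == is treated as a glob pattern.
        [[ $char == $pat ]] && out+=$char
    done
    printf '%s\n' "$out"
}

print_class digit    # prints 0123456789
```

Only the ASCII range is covered here, matching the loop above.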

These are the POSIX character classes. See the grep or re_format man page.

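Character classes are not limited to ASCII. For example, depending on the locale, [[:digit:]] may match not only 0 through 9 but also ٠ through ٩ and ۰ through ۹ (note 1).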

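On old systems such as the TRS-80 or PDP-11, the character set was ASCII, so there were only 127 (or 256) characters to consider. On Mac and Linux, the character set is usually Unicode, encoded as UTF-8.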

Windows code pages have 256 characters; CP1252 is a common one. Newer versions of Windows use Unicode as well, though not UTF-8: Windows uses UTF-16.
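
To make the locale dependence concrete, here is a small sketch; digits_in is a hypothetical helper name. Forcing LC_ALL=C restricts the classes to ASCII, so the result is the same on every machine. What a non-C locale matches beyond ASCII varies by platform, so no output is claimed for that case.

```shell
# digits_in is a hypothetical helper; LC_ALL=C pins the match to ASCII digits.
digits_in() {
    # -a keeps grep from treating non-ASCII bytes as binary data.
    printf '%s\n' "$1" | LC_ALL=C grep -a -o '[[:digit:]]' | tr -d '\n'
    echo
}

digits_in 'a1٣b2'    # prints 12
```

Dropping the LC_ALL=C prefix makes grep use the current locale instead, which is where results can start to differ between systems.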

, . script , .


1 , , , , .

To cover the whole Unicode range (Unicode 4.0 here), you can enumerate every code point:

# Needs bash 4.2 or later for \U, and a UTF-8 locale.
for ((i = 0; i < 0x110000; i++)); do
  printf "\U$(printf '%x' "$i")\n"
done | grep -a '^[[:alpha:]]$'

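Caveats:

  • Combining sequences such as $'E\U0301', which render as a single character (like É), are not covered, because each code point is tested on its own.

  • It does not work well for cntrl, whose members are unprintable and include the newline used here as the line separator.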

  • The output includes Ruby (interlinear annotation) characters that I cannot display on Stack Overflow. Fortunately, they are generally deprecated in favor of proper markup.

  • It is slow.
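
On the last point, part of the cost is the command substitution, which forks a subshell for every code point. A sketch that avoids the fork with printf -v, assuming bash 4.2 or later and a UTF-8 locale; unicode_class and its optional range arguments are hypothetical:

```shell
# unicode_class is a hypothetical helper.
# Usage: unicode_class CLASS [FIRST [LAST]]
unicode_class() {
    local class=$1 first=${2:-1} last=${3:-0x10FFFF} i hex
    for ((i = first; i <= last; i++)); do
        # Surrogates are not valid scalar values, so skip them.
        ((i >= 0xD800 && i <= 0xDFFF)) && continue
        printf -v hex '%x' "$i"     # no subshell, unlike $(printf ...)
        printf "\U$hex\n"
    done | grep -a "^[[:$class:]]\$"
}

unicode_class digit 0x20 0x7E | tr -d '\n'; echo    # prints 0123456789
```

This should still take a while over the full range, since bash loops through more than a million code points; restricting FIRST and LAST to the blocks of interest helps more than anything else.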

A better approach would be to parse the platform's locale definition files, but that depends on the platform.
