POSIX Character Equivalents in Java Regular Expressions

I would like to use a regex like this in Java: [[=a=][=e=][=i=]].

But Java does not support POSIX equivalence classes such as [=a=] and [=e=].

How can I do this? More precisely, is there a way to avoid being limited to US-ASCII?

+4
3 answers

Java supports POSIX character classes. The syntax is just different, for example:

 \p{Lower}
 \p{Upper}
 \p{ASCII}
 \p{Alpha}
 \p{Digit}
 \p{Alnum}
 \p{Punct}
 \p{Graph}
 \p{Print}
 \p{Blank}
 \p{Cntrl}
 \p{XDigit}
 \p{Space}
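A minimal sketch of how these classes look in Java source, where each backslash has to be doubled inside a string literal. The class name `PosixClassDemo` and its helper methods are invented for illustration only.

```java
import java.util.regex.Pattern;

final class PosixClassDemo {
    // \p{Lower} in regex syntax is written "\\p{Lower}" in a Java string literal.
    private static final Pattern LOWER_WORD = Pattern.compile("\\p{Lower}+");
    private static final Pattern HEX_NUMBER = Pattern.compile("\\p{XDigit}+");

    // True if the whole string consists of ASCII lower-case letters.
    static boolean isLowerWord(String s) {
        return LOWER_WORD.matcher(s).matches();
    }

    // True if the whole string consists of hexadecimal digits.
    static boolean isHexNumber(String s) {
        return HEX_NUMBER.matcher(s).matches();
    }

    // Replaces every run of ASCII punctuation with a single space.
    static String stripPunct(String s) {
        return s.replaceAll("\\p{Punct}+", " ");
    }

    public static void main(String[] args) {
        System.out.println(isLowerWord("hello"));   // true
        System.out.println(isLowerWord("Hello"));   // false
        System.out.println(isHexNumber("CAFE42"));  // true
        System.out.println(stripPunct("a,b;c"));    // a b c
    }
}
```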
+10

Quote from http://download.oracle.com/javase/1.6.0/docs/api/java/util/regex/Pattern.html

POSIX Character Classes (US-ASCII Only)

 \p{Lower}    A lower-case alphabetic character: [a-z]
 \p{Upper}    An upper-case alphabetic character: [A-Z]
 \p{ASCII}    All ASCII: [\x00-\x7F]
 \p{Alpha}    An alphabetic character: [\p{Lower}\p{Upper}]
 \p{Digit}    A decimal digit: [0-9]
 \p{Alnum}    An alphanumeric character: [\p{Alpha}\p{Digit}]
 \p{Punct}    Punctuation: One of !"#$%&'()*+,-./:;<=>?@[\]^_`{|}~
 \p{Graph}    A visible character: [\p{Alnum}\p{Punct}]
 \p{Print}    A printable character: [\p{Graph}\x20]
 \p{Blank}    A space or a tab: [ \t]
 \p{Cntrl}    A control character: [\x00-\x1F\x7F]
 \p{XDigit}   A hexadecimal digit: [0-9a-fA-F]
 \p{Space}    A whitespace character: [ \t\n\x0B\f\r]
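The quoted table is from the Java 6 documentation. As a hedged sketch of the US-ASCII restriction and two ways around it, the example below assumes Java 7 or later, which added the `Pattern.UNICODE_CHARACTER_CLASS` flag and binary properties such as `\p{IsLowercase}`. The class name `UnicodeClassDemo` is made up for illustration. Note that none of this emulates POSIX equivalence classes like [=a=]; it only widens the named classes beyond ASCII.

```java
import java.util.regex.Pattern;

final class UnicodeClassDemo {
    // Default behaviour: \p{Lower} is exactly [a-z].
    static final Pattern ASCII_LOWER = Pattern.compile("\\p{Lower}+");

    // Java 7+ (assumption): the flag makes the POSIX classes Unicode-aware.
    static final Pattern UNICODE_LOWER =
            Pattern.compile("\\p{Lower}+", Pattern.UNICODE_CHARACTER_CLASS);

    // Alternative without the flag: the Unicode binary property Lowercase.
    static final Pattern IS_LOWERCASE = Pattern.compile("\\p{IsLowercase}+");

    static boolean matches(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        // e-acute, t, e-acute, written with escapes to avoid source-encoding issues
        String word = "\u00e9t\u00e9";
        System.out.println(matches(ASCII_LOWER, word));   // false
        System.out.println(matches(UNICODE_LOWER, word)); // true
        System.out.println(matches(IS_LOWERCASE, word));  // true
    }
}
```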
+5

Copied from here

Java does not support POSIX bracket expressions, but it does support POSIX character classes using the \p operator. Although the \p syntax is borrowed from the syntax for Unicode properties, the POSIX classes in Java only match ASCII characters, as indicated below. The class names are case sensitive. Unlike the POSIX syntax, which can only be used inside a bracket expression, Java's \p can be used both inside and outside bracket expressions.
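To illustrate the last point, here is a small sketch that uses \p both on its own and inside a bracket expression. The class name `BracketDemo` and the identifier rule it checks are hypothetical; the upper-case \P form, which negates a class, is standard `java.util.regex` syntax but is not mentioned in the quoted text.

```java
import java.util.regex.Pattern;

final class BracketDemo {
    // Outside a bracket expression: one or more ASCII digits.
    static final Pattern DIGITS = Pattern.compile("\\p{Digit}+");

    // Inside bracket expressions, combined with a literal underscore.
    static final Pattern IDENT = Pattern.compile("[\\p{Alpha}_][\\p{Alnum}_]*");

    // Upper-case \P negates the class: any character outside [ \t\n\x0B\f\r].
    static final Pattern NON_SPACE = Pattern.compile("\\P{Space}+");

    static boolean matches(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches(DIGITS, "2024"));    // true
        System.out.println(matches(IDENT, "_tmp1"));    // true
        System.out.println(matches(IDENT, "1abc"));     // false
        System.out.println(matches(NON_SPACE, "a b"));  // false
    }
}
```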

+1
