Java regex for any character?

Is there a regex that accepts any character?

EDIT: To spell out what I'm looking for: I want to create a regular expression that accepts any amount of whitespace, and the input must contain at least one symbol (for example, " $ £, etc.) or (not an exclusive or) at least one character.

+4
2 answers

Yes. The dot ( . ) will match any character, at least if you use it in conjunction with the Pattern.DOTALL flag (otherwise it will not match newline characters). From the docs:

In dotall mode, the expression . matches any character, including a line terminator. By default this expression does not match line terminators.
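The difference the flag makes can be checked with a minimal sketch like the one below. The class name DotAllDemo and the helper matchesADotB are made up for illustration; only Pattern.compile, Pattern.DOTALL, Pattern.matches and the embedded (?s) flag come from java.util.regex.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class DotAllDemo {
    // True if the whole input matches "a.b", compiled with or without DOTALL.
    static boolean matchesADotB(String input, boolean dotAll) {
        int flags = dotAll ? Pattern.DOTALL : 0;
        return Pattern.compile("a.b", flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesADotB("a-b", false));  // true: '.' matches '-'
        System.out.println(matchesADotB("a\nb", false)); // false: '.' skips '\n' by default
        System.out.println(matchesADotB("a\nb", true));  // true: DOTALL lets '.' match '\n'
        // The embedded flag (?s) is the inline equivalent of Pattern.DOTALL.
        System.out.println(Pattern.matches("(?s)a.b", "a\nb")); // true
    }
}
```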


Regarding your editing:

I want to create a regular expression that accepts any amount of whitespace, and the input must contain at least one symbol (for example, " $ £, etc.) or (not an exclusive or) at least one character.

Here is a suggestion:

 \s*\S+ 
  • \s* any number of whitespace characters
  • \S+ one or more ("at least one") non-whitespace characters.
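As a rough sketch of how this pattern behaves, the snippet below tries it on a few inputs. The class and method names are invented for the example. Note one assumption about usage: Matcher.matches() requires the whole input to fit the pattern, so whitespace after the first run of non-whitespace makes it fail, while Matcher.find() only needs a matching substring.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class WhitespaceThenTextDemo {
    private static final Pattern P = Pattern.compile("\\s*\\S+");

    // Whole input must be: optional whitespace, then non-whitespace only.
    static boolean wholeInputMatches(String s) {
        return P.matcher(s).matches();
    }

    // Some substring must fit the pattern, i.e. the input has a non-whitespace character.
    static boolean containsMatch(String s) {
        return P.matcher(s).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeInputMatches("   $"));      // true
        System.out.println(wholeInputMatches("\t\u00A3"));  // true (tab, then the pound sign)
        System.out.println(wholeInputMatches("abc"));       // true
        System.out.println(wholeInputMatches("   "));       // false: whitespace only
        System.out.println(wholeInputMatches(""));          // false: empty input
        System.out.println(wholeInputMatches("a b"));       // false: text after the space is left over
        System.out.println(containsMatch("a b"));           // true: find() accepts a partial match
    }
}
```

If whitespace should be allowed anywhere in the input, find() (or a pattern with a trailing \s* and further groups) may be the better fit; which one is wanted depends on a reading of the question that the thread does not settle.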
+7

In Java, a symbol is \pS , which is not the same as a punctuation character, which is \pP .

I discuss this issue, and list the types of all the ASCII punctuation and symbol characters, in this answer .

Patterns like [\p{Alnum}\s] only work on legacy data from the 1960s (ASCII). To work with text in Java's native character set (Unicode), you need something along these lines:
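To make the symbol versus punctuation split concrete, here is a small sketch that classifies a few single characters with \pS and \pP. The class and helper names are invented for this example, and the combined class [\pS\pP] is just one possible way to accept either kind.

```java
// Hypothetical demo class; the names are illustrative only.
class SymbolVsPunctuationDemo {
    // Unicode general category S (symbols: currency, math, modifier, other).
    static boolean isSymbol(String ch) {
        return ch.matches("\\pS");
    }

    // Unicode general category P (punctuation).
    static boolean isPunctuation(String ch) {
        return ch.matches("\\pP");
    }

    // Accepts a character from either category.
    static boolean isSymbolOrPunctuation(String ch) {
        return ch.matches("[\\pS\\pP]");
    }

    public static void main(String[] args) {
        String[] samples = {"$", "\u00A3", "+", "!", ".", "\""};
        for (String s : samples) {
            System.out.println(s + " symbol=" + isSymbol(s)
                    + " punctuation=" + isPunctuation(s));
        }
        // '$', the pound sign and '+' report as symbols;
        // '!', '.' and '"' report as punctuation.
    }
}
```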

 // Letters, marks, decimal digits, letter numbers, connector punctuation, and enclosed alphanumerics that are "other symbols"
 String identifier_charclass = "[\\pL\\pM\\p{Nd}\\p{Nl}\\p{Pc}[\\p{InEnclosedAlphanumerics}&&\\p{So}]]";
 // The whitespace code points, listed explicitly
 String whitespace_charclass = "[\\u000A\\u000B\\u000C\\u000D\\u0020\\u0085\\u00A0\\u1680\\u180E\\u2000\\u2001\\u2002\\u2003\\u2004\\u2005\\u2006\\u2007\\u2008\\u2009\\u200A\\u2028\\u2029\\u202F\\u205F\\u3000]";
 // Union of the two classes above
 String ident_or_white = "[" + identifier_charclass + whitespace_charclass + "]";

I'm sorry that Java makes it so hard to work with modern data, but at least it is possible.

Just don't ask about boundaries or grapheme clusters. For that, see my other posts .

0
