The regular expressions \p{L} and \p{N}

I am new to regular expressions and got the following regular expression:

(\p{L}|\p{N}|_|-|\.)* 

I know what * means, that | means "or", and that \ escapes.

But I don't know what \p{L} and \p{N} mean. I searched Google for them, but found nothing.

Can someone help me?

+53
xml regex character-class
Feb 15 '13 at 9:01
2 answers

\p{L} matches one code point in the letter category.
\p{N} matches any kind of numeric character in any script.

Source: regular-expressions.info

If you are going to work with regular expressions, I suggest bookmarking that site; it is very useful.
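To make the two categories concrete, here is a minimal sketch in Python. Note that Python's built-in `re` module does not accept `\p{...}`, so this does not run the regex itself; it uses `unicodedata.category` to look up each character's Unicode general category instead. The helper names `in_letter_class` and `in_number_class` are made up for this illustration, and the check is only an approximation of what a `\p{L}` or `\p{N}` aware engine does.

```python
import unicodedata

# Hypothetical helper: approximates \p{L}.
# Letter categories all start with "L" (Lu, Ll, Lt, Lm, Lo).
def in_letter_class(ch: str) -> bool:
    return unicodedata.category(ch).startswith("L")

# Hypothetical helper: approximates \p{N}.
# Number categories all start with "N" (Nd, Nl, No).
def in_number_class(ch: str) -> bool:
    return unicodedata.category(ch).startswith("N")

samples = [
    "a",       # Latin small letter a
    "\u00e9",  # Latin small letter e with acute
    "\u0416",  # Cyrillic capital letter zhe
    "\u4e2d",  # CJK ideograph
    "7",       # ASCII digit
    "\u0663",  # Arabic-Indic digit three
    "\u2167",  # Roman numeral eight
    "\u00bd",  # vulgar fraction one half
    "_",       # underscore (punctuation, neither letter nor number)
]

for ch in samples:
    print(repr(ch), unicodedata.category(ch),
          in_letter_class(ch), in_number_class(ch))
```

The point to take away is that both classes reach well beyond ASCII: letters from any script count as \p{L}, and \p{N} includes digits from other scripts as well as things like Roman numerals and fractions.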

+83
Feb 15 '13 at 9:03

These are Unicode property shortcuts (\p{L} for Unicode letters, \p{N} for Unicode numeric characters, which include digits but are not limited to them). They are supported by .NET, Perl, Java, PCRE, XML, XPath, JGSoft, Ruby (1.9 and higher) and PHP (since 5.1.0).

Anyway, this is a very strange regular expression. You should not use alternation where a character class is sufficient:

 [\p{L}\p{N}_.-]* 
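
A rough way to convince yourself that the two forms accept the same strings is sketched below in Python. Python's built-in `re` module does not understand `\p{...}`, so as an assumption for this demo the ASCII ranges `[A-Za-z]` and `[0-9]` stand in for \p{L} and \p{N}; the names `ALTERNATION`, `CHAR_CLASS` and `same_verdict` are invented for the example. The structure of the two patterns is otherwise the same as in the question and in the character class above.

```python
import re

# Alternation form, mirroring (\p{L}|\p{N}|_|-|\.)* with ASCII stand-ins.
ALTERNATION = re.compile(r"(?:[A-Za-z]|[0-9]|_|-|\.)*")

# Character-class form, mirroring [\p{L}\p{N}_.-]* with the same stand-ins.
# The trailing "-" is literal because it is the last character in the class.
CHAR_CLASS = re.compile(r"[A-Za-z0-9_.-]*")

# Hypothetical helper: True when both patterns agree on the whole string.
def same_verdict(text: str) -> bool:
    return bool(ALTERNATION.fullmatch(text)) == bool(CHAR_CLASS.fullmatch(text))

samples = ["file_name-1.txt", "", "a b", "x/y", "v2.0-rc_1"]

for s in samples:
    print(repr(s),
          bool(ALTERNATION.fullmatch(s)),
          bool(CHAR_CLASS.fullmatch(s)),
          same_verdict(s))
```

Both patterns accept exactly the strings made up of the listed characters; the character class is simply shorter to read and gives the engine a single set to test each character against instead of a list of alternatives.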
+16
Feb 15 '13 at 9:06


