Why doesn't '\W' match the underscore character in Python?

I know that _ is not matched by \W, while any other punctuation character is. As stated in the docs, \w matches alphanumeric characters plus the underscore, and \W matches everything that \w does not.

At the same time:

[image]

This has always puzzled me, but I never really asked why.

Is this related to the special role _ plays in Python?

1 answer

A lot of the Python regex syntax in the re module comes from Perl, which was influenced by sed and awk. \w comes from there, underscore included, and has a long history.


In the older regex module (which was deprecated in Python 1.5), \w did not include _, as seen in the Python 1.4 documentation:

\w

Matches any alphanumeric character; this is equivalent to the set [a-zA-Z0-9].


P.S. Although it is not very convenient, you can match everything that \W matches plus _ with the character class [\W_].
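
As a rough illustration of the difference, here is a minimal sketch using the current re module. The sample string and variable names are made up for the example:

```python
import re

# Hypothetical sample string containing an underscore and other punctuation.
text = "a_b-c.d"

# \W leaves the underscore alone, because \w counts it as a word character.
non_word = re.findall(r"\W", text)
print(non_word)  # ['-', '.']

# [\W_] adds the underscore back to the set of non-word characters.
non_word_or_underscore = re.findall(r"[\W_]", text)
print(non_word_or_underscore)  # ['_', '-', '.']

# The same class works for splitting on any punctuation, underscore included.
parts = re.split(r"[\W_]+", "snake_case-and.dots")
print(parts)  # ['snake', 'case', 'and', 'dots']
```

Inside a character class, \W and _ are simply combined, so no alternation or lookaround is needed.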
