Exclude .txt files.

I want to exclude the ".txt" files from a directory listing using a regex (and only a regex). But this code does not work, and I do not understand why. I have this list:

['/var/tmp/COMMUN/4.1.0_41/Apache',
 '/var/tmp/COMMUN/4.1.0_41/META-INF', 
 '/var/tmp/COMMUN/4.1.0_41/RewriteRules',
 '/var/tmp/COMMUN/4.1.0_41/Robots', 
 '/var/tmp/COMMUN/4.1.0_41/smokeTest',
 '/var/tmp/COMMUN/4.1.0_41/tutu.txt']

And I tried this code:

# list_dir is a personal function
list_dir(toto, filter_function=lambda x: re.match("^.*(?!txt)$", x))

Does anyone see what is wrong?

+4
4 answers

Normally .* is greedy: it matches as much as it can, as long as whatever follows it can still match. Since the empty string satisfies (?!txt), .* simply consumes the entire string, which means that this regular expression matches every string.
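To make the greedy-match explanation concrete, here is a minimal sketch that runs the pattern from the question against two of the paths from the question. The variable names (paths, pattern, matched) are only illustrative.

```python
import re

# Two of the paths from the question: one directory, one .txt file.
paths = [
    "/var/tmp/COMMUN/4.1.0_41/Apache",
    "/var/tmp/COMMUN/4.1.0_41/tutu.txt",
]

# The pattern from the question. .* consumes the whole string, and at the
# end of the string nothing follows, so (?!txt) always succeeds.
pattern = re.compile(r"^.*(?!txt)$")

# Every path matches, including the .txt file, so nothing is filtered out.
matched = [p for p in paths if pattern.match(p)]
print(matched)
```

Both entries end up in matched, which is why the filter appears to do nothing.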

Try .*\.txt$ and negate the result of re.match instead.

Also, re.match only anchors the regex at the start of the string, so the leading ^ is redundant. The end of the line is not anchored, so the $ is still needed there.

+4

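As already said, ^.*(?!txt)$ matches everything: .* is greedy and runs up to the end of the string ($), and there nothing follows, so the negative lookahead is satisfied.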

What you want is probably a lookbehind: (^.*(?<!txt)$)

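Note that lookbehind is not supported by every regex engine; Python supports it (as well as lookahead).
Many engines only allow fixed-width lookbehinds (i.e. no .* or .{0,10}), and some (for example older JavaScript engines) have no lookbehind at all.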
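As a rough illustration of the lookbehind approach in Python, the sketch below uses a fixed-width negative lookbehind with the dot escaped, and also checks that the re module rejects a variable-width lookbehind. The sample names and variable names are made up for the example.

```python
import re

# Illustrative sample names, not taken from a real directory.
names = ["Apache", "Robots", "tutu.txt"]

# Fixed-width negative lookbehind: at the end of the string, the four
# characters just before it must not be ".txt".
not_txt = re.compile(r"^.*(?<!\.txt)$")
kept = [n for n in names if not_txt.match(n)]
print(kept)

# Python's re module requires lookbehind patterns to have a fixed width,
# so a lookbehind containing .* fails at compile time.
try:
    re.compile(r"(?<!\.txt.*)$")
    variable_width_ok = True
except re.error:
    variable_width_ok = False
print(variable_width_ok)
```

The lookbehind in the answer, (?<!txt), has a fixed width of three characters, so Python accepts it as well.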

(In JS, you would instead match .*txt$ and negate the result.)

For more on lookahead and lookbehind, see: http://www.regular-expressions.info/lookaround.html

(Also, you should match \.txt so that the dot is included. And, as others have said, the simplest option is x.endswith('.txt').)

+4

Why not simply this?

x.endswith(".txt")

Or with a regex:

not re.search("\\.txt$", x)
  • not: negate the result, so names that match are excluded
  • \\.: one literal dot
  • txt: the characters txt
  • $: end of input
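The two suggestions above can be compared side by side. This is a small sketch with invented sample names, chosen to include edge cases such as a bare "txt" and a name where ".txt" is not at the end.

```python
import re

# Invented sample names, including edge cases.
names = ["Apache", "Robots", "tutu.txt", "txt", "notes.txt.bak"]

# Exclude names ending in ".txt" using the string method.
by_endswith = [n for n in names if not n.endswith(".txt")]

# Exclude the same names using the regex from the answer.
by_regex = [n for n in names if not re.search(r"\.txt$", n)]

print(by_endswith)
print(by_regex)
```

For these inputs both filters keep the same names; only "tutu.txt" is dropped.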
+2

Does anyone see what is wrong?

^ matches the beginning of the string, then .* matches every character in the input string. Next comes (?!txt): nothing is left in the string but its end, so the lookahead passes every time, after which $ matches the end of the string.

You can fix this with a much simpler regular expression:

list_dir(toto, filter_function=lambda x: not re.search(r"\.txt$", x))
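Since list_dir is the asker's own function and its code is not shown, the following is only a sketch with a hypothetical stand-in: it assumes list_dir takes a list of paths and keeps those for which filter_function returns a truthy value, and that toto is the list from the question.

```python
import re

def list_dir(entries, filter_function):
    """Hypothetical stand-in for the asker's list_dir (real signature unknown).

    Keeps every entry for which filter_function returns a truthy value.
    """
    return [e for e in entries if filter_function(e)]

# Assumption: toto holds the list shown in the question.
toto = [
    "/var/tmp/COMMUN/4.1.0_41/Apache",
    "/var/tmp/COMMUN/4.1.0_41/META-INF",
    "/var/tmp/COMMUN/4.1.0_41/RewriteRules",
    "/var/tmp/COMMUN/4.1.0_41/Robots",
    "/var/tmp/COMMUN/4.1.0_41/smokeTest",
    "/var/tmp/COMMUN/4.1.0_41/tutu.txt",
]

# The filter from the answer: keep paths that do not end in ".txt".
result = list_dir(toto, filter_function=lambda x: not re.search(r"\.txt$", x))
print(result)
```

Under those assumptions, result contains the five directories and omits tutu.txt.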
+1
