Regular expression to match only certain file names

Using std::regex on a given file path, I want to match only file names that end in .txt and that do not have the form _test.txt or .txtTEMP. Any other underscore is fine.

So for example:

  • somepath/testFile.txt should match.
  • somepath/test_File.txt should match.
  • somepath/testFile_test.txt should not match.
  • somepath/testFile.txtTEMP should not match.

What is the correct regular expression for such a pattern?

What I tried:

(.*?)(\.txt) ---> This matches any file path ending in .txt.

To exclude files ending in _test.txt, I tried a negative lookahead:

(.*?)(?!_test)(\.txt)

But that did not work.

I also tried a negative lookbehind, but MSVC14 (Visual Studio 2015) throws a std::regex_error when constructing the regex, so I'm not sure whether lookbehind is unsupported or whether I am using the wrong syntax.

+6
4 answers

Based on what you posted, use this pattern:

 ^(?!.*_).*\.txt$ 

Or, based on the OP's edit, this pattern (note that it relies on a negative lookbehind, which std::regex does not support):

 ^(.*(?<!_test)\.txt$) 

+2
 ^(?!.*?_test\.).*\.txt$ 

I don't have access to VS 2015 at the moment, but this only uses a lookahead, so it should work.
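
As a minimal sketch of how the lookahead pattern above could be used with std::regex (the function name isWantedTxtFile is hypothetical, chosen only for illustration), assuming the default ECMAScript grammar:

```cpp
#include <regex>
#include <string>

// Hypothetical helper name; not part of the original answer.
// Returns true for paths ending in ".txt" that do not contain "_test.".
bool isWantedTxtFile(const std::string& path) {
    // The negative lookahead is anchored at the start of the string and
    // scans forward, so it rejects any path containing "_test." anywhere.
    static const std::regex pattern(R"(^(?!.*?_test\.).*\.txt$)");
    // regex_match requires the whole string to match, so ".txtTEMP" fails.
    return std::regex_match(path, pattern);
}
```

Note that the lookahead looks for "_test." anywhere in the path, so a path with a directory such as some_test.dir/file.txt would also be rejected. If that matters, the lookahead can be narrowed to (?!.*_test\.txt$).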

+2

The best option? Do not use regular expressions at all, especially for a simple suffix check like this one.

First, a couple of simplifications follow from the requirements in the question:

  • Since the input must end with the extension ".txt", there is no need to check separately for ".txtTEMP": any string ending in ".txtTEMP" already fails that test.
  • The only remaining rejection case is an input ending in "_test.txt". Since the extension is already known to be ".txt", it is enough to check whether the stem ends with "_test".

Both checks look at a fixed number of characters from the end of the input string. Since the strings being compared are known in advance, their lengths can be fixed at compile time:

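 // strlen is not constexpr in standard C++, so the lengths are taken from the array sizes instead.
 constexpr char doMatch[] = ".txt";
 constexpr auto doMatchSize = sizeof(doMatch) - 1;
 constexpr char doNotMatch[] = "_test";
 constexpr auto doNotMatchSize = sizeof(doNotMatch) - 1 + doMatchSize;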

Given a std::string named input, it can be tested as follows:

 if (input.size() >= doMatchSize &&
     std::equal(input.end() - doMatchSize, input.end(), doMatch) &&
     (input.size() < doNotMatchSize ||
      !std::equal(input.end() - doNotMatchSize, input.end() - doMatchSize, doNotMatch)))

Here you can see a live example: http://ideone.com/7BcyFi
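
For reference, the two fixed-offset checks could be wrapped in a single function roughly as follows. This is only a sketch: the function name endsWithTxtButNotTest is an assumption for illustration, and the lengths are taken from array sizes rather than strlen so that they are compile-time constants in standard C++.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical wrapper name; it combines the two suffix checks described above.
bool endsWithTxtButNotTest(const std::string& input) {
    constexpr char doMatch[] = ".txt";
    constexpr char doNotMatch[] = "_test";
    // sizeof includes the terminating '\0', hence the "- 1".
    constexpr std::size_t doMatchSize = sizeof(doMatch) - 1;
    constexpr std::size_t doNotMatchSize = sizeof(doNotMatch) - 1 + doMatchSize;

    // The input must end with ".txt".
    if (input.size() < doMatchSize ||
        !std::equal(input.end() - doMatchSize, input.end(), doMatch)) {
        return false;
    }
    // Inputs shorter than "_test.txt" cannot end with it, so accept them.
    if (input.size() < doNotMatchSize) {
        return true;
    }
    // Reject when the characters just before ".txt" are "_test".
    return !std::equal(input.end() - doNotMatchSize,
                       input.end() - doMatchSize, doNotMatch);
}
```

Calling endsWithTxtButNotTest on the four example paths from the question accepts the first two and rejects the last two.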

+1

One trick to emulate the lookbehind that you really need (but which, unfortunately, is not supported in C++11) is to reverse the string and then use a lookahead. Your regex then becomes something like

 ^txt\.(?!tset_).* 
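
A minimal sketch of this reverse-then-lookahead idea with std::regex might look like the following. The helper name matchesReversed is hypothetical, and the path is reversed with reverse iterators before matching:

```cpp
#include <regex>
#include <string>

// Hypothetical helper illustrating the reverse-then-lookahead trick.
bool matchesReversed(const std::string& path) {
    // Reverse the path so that the suffix of interest becomes a prefix.
    const std::string reversed(path.rbegin(), path.rend());
    // "txt." is ".txt" reversed and "tset_" is "_test" reversed.
    static const std::regex pattern(R"(^txt\.(?!tset_).*)");
    return std::regex_match(reversed, pattern);
}
```

The extra string copy is the price paid for staying within the features that std::regex supports.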

The problem with the lookahead you tried is that it is evaluated at the same position where the \.txt part has to start matching. So the '(?!_test)(\.txt)' part of your regular expression says: "I want something that does not start with _test but does match .txt." Anything that ends in .txt satisfies that, so the lookahead never rejects anything.
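
To make this concrete, the following sketch runs the pattern from the question through std::regex_match. The function name opAttemptMatches is hypothetical; the pattern itself is the one the question tried.

```cpp
#include <regex>
#include <string>

// Hypothetical helper that applies the question's attempted pattern.
bool opAttemptMatches(const std::string& path) {
    // (?!_test) is checked at the position where "\.txt" begins, where the
    // next characters are ".txt" and never "_test", so it always succeeds.
    static const std::regex pattern(R"((.*?)(?!_test)(\.txt))");
    return std::regex_match(path, pattern);
}
```

With this pattern, somepath/testFile_test.txt is accepted just like somepath/testFile.txt, which is the behavior the question reports.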

Update: a regex with a negative lookbehind (which will NOT work with std::regex in C++, but works, for example, in Python):

 ^.*(?<!_test)\.txt$ 
0