Regular expression for a string containing one word but not another

I set up some goals in Google Analytics and could use a little help with a regular expression.

Let's say I have 4 URLs

 http://www.anydotcom.com/test/search.cfm?metric=blah&selector=size&value=1
 http://www.anydotcom.com/test/search.cfm?metric=blah2&selector=style&value=1
 http://www.anydotcom.com/test/search.cfm?metric=blah3&selector=size&value=1
 http://www.anydotcom.com/test/details.cfm?metric=blah&selector=size&value=1

I want to create an expression that matches any URL that contains selector=size but does not contain details.cfm.

I know that to search for a string that DOES NOT contain another string, I can use this expression:

 (^((?!details.cfm).)*$) 

But I'm not sure how to add selector=size to the expression.

Any help would be greatly appreciated!

+59
regex regex-negation google-analytics
Jun 01 '10 at 20:21
5 answers

This should do it:

 ^(?!.*details\.cfm).*selector=size.*$ 

^.*selector=size.*$ should be clear enough. The first bit, (?!.*details\.cfm), is a negative lookahead: before matching anything, it checks that the line does not contain "details.cfm" (with any number of characters before it).
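As a quick check, here is a minimal sketch that runs this pattern against the four URLs from the question. It assumes Python's re module purely for illustration; the regex engine behind Google Analytics may support a different feature set, so treat this as a way to see the logic rather than a guarantee about that tool.

```python
import re

# Pattern from the answer: the negative lookahead at the start of the line
# rejects any line containing "details.cfm"; the rest requires "selector=size".
PATTERN = re.compile(r"^(?!.*details\.cfm).*selector=size.*$")

# The four sample URLs from the question.
URLS = [
    "http://www.anydotcom.com/test/search.cfm?metric=blah&selector=size&value=1",
    "http://www.anydotcom.com/test/search.cfm?metric=blah2&selector=style&value=1",
    "http://www.anydotcom.com/test/search.cfm?metric=blah3&selector=size&value=1",
    "http://www.anydotcom.com/test/details.cfm?metric=blah&selector=size&value=1",
]


def matching_urls(urls):
    """Return the URLs that contain selector=size but not details.cfm."""
    return [url for url in urls if PATTERN.search(url)]


if __name__ == "__main__":
    for url in matching_urls(URLS):
        print(url)
```

Only the first and third URLs should come back: the second lacks selector=size and the fourth contains details.cfm.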

+80
Jun 01 '10 at 20:26

The regex can be (Perl syntax):

 `/^(?!.*details\.cfm).*selector=size.*$/` 
+4
Jun 01 '10 at 20:35
 ^(?=.*selector=size)(?:(?!details\.cfm).)+$ 

If your regex engine supports possessive quantifiers (although I suspect that Google Analytics does not), I think this will perform better on large inputs:

 ^[^?]*+(?<!details\.cfm).*?selector=size.*$ 
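To illustrate the first pattern in this answer, here is a small sketch, again assuming Python's re module. The function name is_goal_url is hypothetical. The second pattern, with the `*+` quantifier, is left out on purpose: as far as I know, Python's re only accepts that syntax from version 3.11 onward.

```python
import re

# The lookahead requires "selector=size" somewhere in the line. The group
# (?:(?!details\.cfm).)+ then consumes one character at a time and refuses
# to step onto any position where "details.cfm" begins.
TEMPERED = re.compile(r"^(?=.*selector=size)(?:(?!details\.cfm).)+$")


def is_goal_url(url):
    """True if the URL has selector=size and never mentions details.cfm."""
    return TEMPERED.search(url) is not None


if __name__ == "__main__":
    base = "http://www.anydotcom.com/test/"
    for path in (
        "search.cfm?metric=blah&selector=size&value=1",
        "search.cfm?metric=blah2&selector=style&value=1",
        "details.cfm?metric=blah&selector=size&value=1",
    ):
        print(path, is_goal_url(base + path))
```

This version checks for details.cfm at every character position, which is why the answer suggests an alternative for large inputs.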
+1
Jun 01 '10 at 20:27

I was looking for a way to avoid line buffering in a tail pipeline, in a situation similar to the OP's, and Kobi's solution works fine for me. In my case, I exclude lines containing "bot" or "spider" and keep only those containing " / " (for my root document).

My initial command:

 tail -f mylogfile | grep --line-buffered -v 'bot\|spider' | grep ' / ' 

Now (with the "-P" switch for Perl-compatible regular expressions):

 tail -f mylogfile | grep -P '^(?!.*(bot|spider)).*\s\/\s.*$' 
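The same filter can be sketched in Python to see which lines survive. The log lines below are invented for this example and do not come from a real log; root_hits is a hypothetical helper name.

```python
import re

# Same idea as the grep -P command: reject lines mentioning "bot" or
# "spider", then require a "/" with whitespace on both sides.
ROOT_HIT = re.compile(r"^(?!.*(bot|spider)).*\s/\s.*$")

# Hypothetical access-log lines, made up for illustration.
SAMPLE_LINES = [
    '10.0.0.1 - - "GET / HTTP/1.1" 200 "Mozilla/5.0"',
    '10.0.0.2 - - "GET / HTTP/1.1" 200 "Googlebot/2.1"',
    '10.0.0.3 - - "GET /about HTTP/1.1" 200 "Mozilla/5.0"',
    '10.0.0.4 - - "GET / HTTP/1.1" 200 "some spider"',
]


def root_hits(lines):
    """Return root-document requests that do not come from bots or spiders."""
    return [line for line in lines if ROOT_HIT.search(line)]


if __name__ == "__main__":
    for line in root_hits(SAMPLE_LINES):
        print(line)
```

Only the first sample line should pass. Note that the match is case-sensitive, so a user agent spelled "Bot" would not be excluded.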
0
Jun 16 '16 at 11:11

An easy way to do this is to specify 0 instances of the string, as follows:

 (string_to_exclude){0} 
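It may be worth testing this before relying on it. In Python's re module, at least, a {0} quantifier means the group matches the empty string, so the pattern succeeds on any input and excludes nothing. The sketch below assumes Python and uses a hypothetical helper name; other engines may behave differently.

```python
import re

# "{0}" asks for exactly zero repetitions of the group, which is satisfied
# by the empty string at the very first position of any input.
ZERO_TIMES = re.compile(r"(details\.cfm){0}")


def excluded_by_zero_quantifier(text):
    """True if the pattern fails to match, i.e. the text would be excluded."""
    return ZERO_TIMES.search(text) is None


if __name__ == "__main__":
    url = "http://www.anydotcom.com/test/details.cfm?metric=blah&selector=size&value=1"
    print(excluded_by_zero_quantifier(url))
```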
-3
Jul 27 2018-12-12T00:


