Short answer: you cannot do what you ask. Technically, the first part has an ugly answer, but the second part (as I understand it) has no answer.
For your first part, I have a rather impractical (but pure regex) answer; anything better would require code (e.g. @rednaw's much cleaner answer above). I extended the test string to make it more complete. (For simplicity, I use grep -Pio: -P for PCRE, -i for case-insensitive matching, -o to print each match on its own line.)
    $ echo "Ben sits on a bench better end" \
      |grep -Pio '(?=b(?!en)|(?<!b)en|e(?!n)|(?<!be)n|[^ben])\w+'
    sits
    on
    a
    ch
    better
    end
I basically special-case each letter in "ben", so that I only include occurrences of those letters that are not themselves part of the string "ben". As I said, it's not very practical, even if it technically answers your question. I also saved a step-by-step explanation of this regex if you would like more detail.
If you are forced to use a pure regex rather than code, the best approach for things like this is to write code that generates the regex. That way you can keep a clean copy.
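As a rough illustration of generating the regex from code, here is a minimal Python sketch. The function name `build_non_word_pattern` is made up for this example, and it assumes the forbidden word consists of distinct letters or digits. It is a variant of the idea above rather than the exact grep pattern: the per-letter check is applied to every character of the match, not only the first.

```python
import re

def build_non_word_pattern(word: str) -> str:
    """Build a regex matching runs of word characters that are not part of `word`.

    Assumption: every character in `word` is distinct (and alphanumeric),
    which keeps the per-letter case analysis simple.
    """
    if len(set(word.lower())) != len(word):
        raise ValueError("this sketch only handles words with distinct letters")
    alternatives = []
    for i, letter in enumerate(word):
        before, after = re.escape(word[:i]), re.escape(word[i + 1:])
        letter = re.escape(letter)
        if after:   # allowed when the rest of the word does not follow
            alternatives.append(f"{letter}(?!{after})")
        if before:  # allowed when the start of the word does not precede it
            alternatives.append(f"(?<!{before}){letter}")
    # any word character that is not a letter of the word is always allowed
    alternatives.append(f"[^{re.escape(word)}\\W]")
    return "(?:" + "|".join(alternatives) + ")+"

pattern = build_non_word_pattern("ben")
print(pattern)
# (?:b(?!en)|e(?!n)|(?<!b)e|(?<!be)n|[^ben\W])+
print(re.findall(pattern, "Ben sits on a bench better end", flags=re.IGNORECASE))
# ['sits', 'on', 'a', 'ch', 'better', 'end']
```

The readable source is the single word "ben"; the ugly pattern is regenerated whenever that word changes.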
I'm not sure what you are asking for in the rest of your question; a regex quantifier is either greedy or lazy [1] [2], and I don't know of any implementation that can find "every combination" rather than just the first match found by either method. Even if one existed, it would be far too slow for real-world use (as opposed to quick examples); a regex engine forced to explore every possibility would be unbearably slow, which is basically ReDoS.
Examples:
    # greedy evaluation (default)
    $ echo 1a2be3 |grep -Pio '(?!\d[a-z]\d)\w+'
    a2be3
I assume you are looking for 1 1a a a2 a2b a2be a2be3 2 2b 2be 2be3 b be be3 e e3 3, but I don't think you can get that with a pure regex. You will need code to generate each substring, and then you can use a regex to filter out the forbidden pattern (again, it all comes down to greedy vs. lazy vs. ReDoS).
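A minimal Python sketch of that approach, assuming the forbidden pattern is digit-letter-digit as in the grep example (the function name `allowed_substrings` is made up for illustration):

```python
import re

def allowed_substrings(text: str, forbidden: str) -> list[str]:
    """Return every substring of `text` that does not contain `forbidden`."""
    pattern = re.compile(forbidden, re.IGNORECASE)
    result = []
    for start in range(len(text)):
        for end in range(start + 1, len(text) + 1):
            candidate = text[start:end]
            # keep the candidate only if the forbidden pattern is absent
            if not pattern.search(candidate):
                result.append(candidate)
    return result

print(" ".join(allowed_substrings("1a2be3", r"\d[a-z]\d")))
# 1 1a a a2 a2b a2be a2be3 2 2b 2be 2be3 b be be3 e e3 3
```

Note that the number of substrings grows quadratically with the input length, so this is only reasonable for short strings.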
Adam katz