Match occurrences of a character before a control character, matching zero if the control character is absent

I am working on functionality that allows the user to specify a "wildcard" path for items in a folder hierarchy, along with an associated action that will be performed when an item matches this path. For example:

        Path          Action
        -----------   -------
     1. $/foo/*/baz   include
     2. $/foo/bar/*   exclude

With the example above, the item at $/foo/bar/baz matches both actions. Given this, I want to produce a rough score for the specificity of a wildcard pattern, based on the "depth" at which the first wildcard character occurs. The path with the greatest depth wins. Importantly, only a * delimited by forward slashes ( /*/ ) is allowed as a wildcard (except for /* at the end of the path), and any number of wildcards can be specified at different points in the path.

TL;DR

So, I think a regex that counts the number of slashes before the first * is the way to go. However, for a number of reasons, when there is no wildcard in the path the slash count should be zero. I have the following negative lookbehind:

  (?<!\*.*)/ 

which works great when there are wildcards (for example, it matches 2 forward slashes for path 1 above and 3 for path 2), but when there is no wildcard it naturally matches all the slashes. I am sure it is a simple step to make it match none, but thanks to my rusty regex skills I'm stuck.

Ideally, from an academic point of view, I would like to see whether a single regular expression can capture this; however, bonus points are on offer for a more elegant solution to the problem!

+4
3 answers

This would be one way to do it:

    match = Regex.Match(subject,
        @"^         # Start of string
        (           # Match and capture in group number 1...
          [^*/]*    # any number of characters except slashes or asterisks
          /         # followed by a slash
        )*          # zero or more times.
        [^*/]*      # Match any additional non-slash/non-asterisk characters.
        \*          # Then match an asterisk",
        RegexOptions.IgnorePatternWhitespace);

This regex fails to match if there is no asterisk in the subject string (score of 0). If the regex does match, you can be sure that there is at least one asterisk in it.

The clever thing is that .NET regexes, unlike most other regex flavors, can actually count how many times a repeated capturing group has matched (most other regex engines simply discard that information), which allows us to determine the number of slashes before the first asterisk in the string.

This information can be found in

 match.Groups[1].Captures.Count 

(Of course, this means that “no slashes before the first asterisk” and “no asterisk at all” will both get a score of 0, which appears to be what you're asking for in your question, although I'm not sure why that makes sense.)
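
As a minimal sketch of how the pattern and `Captures.Count` fit together, the regex above can be wrapped in a small helper. The class name `WildcardDepth` and the method name `SlashesBeforeFirstAsterisk` are made up for illustration, and the pattern is written compactly (without `IgnorePatternWhitespace`) but is otherwise the same:

```csharp
using System;
using System.Text.RegularExpressions;

public static class WildcardDepth
{
    // Same pattern as above in compact form: zero or more "segment/" groups,
    // an optional partial segment, then the first asterisk.
    private static readonly Regex FirstAsterisk =
        new Regex(@"^([^*/]*/)*[^*/]*\*");

    // Returns the number of slashes before the first asterisk,
    // or 0 when the path contains no asterisk (the regex does not match).
    public static int SlashesBeforeFirstAsterisk(string path)
    {
        Match match = FirstAsterisk.Match(path);
        return match.Success ? match.Groups[1].Captures.Count : 0;
    }
}
```

Under these assumptions, `$/foo/*/baz` should score 2, `$/foo/bar/*` should score 3, and a path without a wildcard such as `$/foo/bar/baz` should score 0.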

+2

A method that fits the task:

  • Validate all test paths (make sure each one is valid and either contains /*/ or ends with * ).

  • Use a sorted collection to keep track of the test paths and their associated actions.

  • Sort the collection based on the position of the wildcard in the string.

  • Test the item against each path in the sorted collection.
    You can replace * in the string with .*? to use it in a regular expression.

  • Stop at the first match and return the associated action; otherwise continue with the next test in the collection.

A quick test implementation of some of the above:

    using System;
    using System.Collections.Generic;
    using System.Text.RegularExpressions;

    void Main()
    {
        // Define some actions to test and add them to a collection
        var ActionPaths = new List<ActionPath>() {
            new ActionPath() {TestPath = "/foo/*/baz", Action = "include"},
            new ActionPath() {TestPath = "/foo/bar/*", Action = "exclude"},
            new ActionPath() {TestPath = "/foo/doo/boo", Action = "exclude"},
        };

        // Sort the list of actions based on the depth of the wildcard
        ActionPaths.Sort();

        // The path for which we are trying to find the corresponding action
        string PathToTest = "/foo/bar/baz";

        // Test all ActionPaths from the top down until we find something
        var found = default(ActionPath);
        foreach (var ap in ActionPaths)
        {
            if (ap.IsMatching(PathToTest))
            {
                found = ap;
                break;
            }
        }

        // At this point, we have either found an Action, or nothing at all
        if (found != default(ActionPath))
        {
            // Found an Action!
        }
        else
        {
            // Found nothing at all :-(
        }
    }

    // Holds a test path and its associated action
    class ActionPath : IComparable<ActionPath>
    {
        public string TestPath;
        public string Action;

        // Returns true if the given path matches the TestPath
        public bool IsMatching(string path)
        {
            // Escape regex metacharacters (such as $), then turn each
            // escaped * into a lazy wildcard
            var t = Regex.Escape(TestPath).Replace(@"\*", ".*?");
            return Regex.IsMatch(path, "^" + t + "$");
        }

        // Implements IComparable<T>: paths whose first wildcard
        // occurs later in the string sort first
        public int CompareTo(ActionPath other)
        {
            if (other == null || other.TestPath == null) return 1;
            var ia = TestPath.IndexOf("*");
            var ib = other.TestPath.IndexOf("*");
            if (ia < ib) return 1;
            if (ia > ib) return -1;
            return 0;
        }
    }

+1

There is no need for regular expressions.

With LINQ, this is a 2-liner:

    string s = "$/foo/bar/baz";
    var asteriskPos = s.IndexOf('*'); // will be -1 if there is no asterisk
    var slashCount = s.Where((c, i) => c == '/' && i < asteriskPos).Count();

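
To show how the two lines above could feed the "deepest wildcard wins" rule from the question, here is a rough sketch. The class `WildcardSpecificity` and its methods `Depth` and `MoreSpecific` are hypothetical names, and the tie-breaking rule (the first argument wins) is an assumption, since the question does not say how ties should be resolved:

```csharp
using System;
using System.Linq;

public static class WildcardSpecificity
{
    // Counts the slashes that occur before the first asterisk.
    // IndexOf returns -1 when there is no asterisk, so no index
    // satisfies i < -1 and the count is 0.
    public static int Depth(string pattern)
    {
        int asteriskPos = pattern.IndexOf('*');
        return pattern.Where((c, i) => c == '/' && i < asteriskPos).Count();
    }

    // Hypothetical helper: returns the pattern whose first wildcard
    // sits deeper in the path. Ties go to the first argument (an assumption).
    public static string MoreSpecific(string a, string b)
    {
        return Depth(a) >= Depth(b) ? a : b;
    }
}
```

With the two patterns from the question, `Depth` should give 2 for `$/foo/*/baz` and 3 for `$/foo/bar/*`, so `MoreSpecific` would pick `$/foo/bar/*` and the item at `$/foo/bar/baz` would be excluded.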
+1
