Distinct by part of a string in LINQ

Given this collection:

var list = new [] { "1.one", "2. two", "no number", "2.duplicate", "300. three hundred", "4-ignore this"}; 

How can I get the subset of elements that start with a number followed by a period (regular expression @"^\d+(?=\.)"), keeping only one element per distinct number? That is:

 {"1.one", "2. two", "300. three hundred"} 

UPDATE:

My attempt was to pass an IEqualityComparer to the Distinct method. I took this GenericCompare class and tried the following code, to no avail:

    var pattern = @"^\d+(?=\.)";
    var comparer = new GenericCompare<string>(s => Regex.Match(s, pattern).Value);
    list.Where(f => Regex.IsMatch(f, pattern)).Distinct(comparer);
regex linq linq-to-objects
Aug 26 '14 at 19:05
4 answers

If you want a LINQ approach, you can add a named capture group to the regular expression, keep only the elements that match it, group them by the captured number, and finally take the first string in each group. I like the readability of this solution, but I won't be surprised if there is a more efficient way to eliminate duplicates. Let's see if anyone has a different approach.

Something like this:

    list.Where(s => regex.IsMatch(s))
        .GroupBy(s => regex.Match(s).Groups["num"].Value)
        .Select(g => g.First())

You can try with this example:

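    using System;
    using System.Linq;
    using System.Text.RegularExpressions;

    public class Program
    {
        // Captures the leading digits in the "num" group; the period must follow them.
        private static readonly Regex regex = new Regex(@"^(?<num>\d+)\.", RegexOptions.Compiled);

        public static void Main()
        {
            var list = new [] { "1.one", "2. two", "no number", "2.duplicate", "300. three hundred", "4-ignore this" };

            // Keep the matching strings, group them by number, take the first of each group.
            var distinctWithNumbers = list.Where(s => regex.IsMatch(s))
                                          .GroupBy(s => regex.Match(s).Groups["num"].Value)
                                          .Select(g => g.First());

            distinctWithNumbers.ToList().ForEach(Console.WriteLine);
            Console.ReadKey();
        }
    }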

You can try using it in this fiddle.

As @orad pointed out in the comments, MoreLinq provides a DistinctBy() extension method that can replace the grouping and first-element selection when eliminating duplicates:

    var distinctWithNumbers = list.Where(s => regex.IsMatch(s))
                                  .DistinctBy(s => regex.Match(s).Groups["num"].Value);
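
As a side note, here is a minimal sketch of the same idea without the MoreLinq dependency. It assumes the project targets .NET 6 or later, where System.Linq ships its own Enumerable.DistinctBy; on older frameworks this will not compile and the MoreLinq version above is the one to use.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class Program
{
    private static readonly Regex regex = new Regex(@"^(?<num>\d+)\.", RegexOptions.Compiled);

    public static void Main()
    {
        var list = new[] { "1.one", "2. two", "no number", "2.duplicate", "300. three hundred", "4-ignore this" };

        // Assumption: .NET 6+, so DistinctBy resolves to System.Linq.Enumerable.DistinctBy.
        // The key is the captured number, so "2. two" and "2.duplicate" share a key.
        var distinctWithNumbers = list
            .Where(s => regex.IsMatch(s))
            .DistinctBy(s => regex.Match(s).Groups["num"].Value);

        foreach (var s in distinctWithNumbers)
        {
            Console.WriteLine(s);
        }
    }
}
```

If MoreLinq is referenced in the same file on .NET 6+, the two DistinctBy methods can clash, so only one of them should be in scope.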

Try this fiddle

EDIT

If you want to use your comparer, its GetHashCode must also go through the expression, so that strings with the same number produce the same hash code:

    public int GetHashCode(T obj)
    {
        return _expr.Invoke(obj).GetHashCode();
    }

Then you can use the comparer with a lambda function that takes a string and extracts the number using the regular expression:

    var comparer = new GenericCompare<string>(s => regex.Match(s).Groups["num"].Value);
    var distinctWithNumbers = list.Where(s => regex.IsMatch(s)).Distinct(comparer);
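
The GenericCompare class itself is not reproduced on this page, so here is a rough sketch of what a key-based comparer could look like. KeyComparer<T> and its _expr field are hypothetical stand-ins, not the actual linked class. The point it illustrates is that Distinct buckets elements by hash code and only calls Equals on elements whose hashes collide, so both methods have to derive their result from the same extracted key.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical stand-in for GenericCompare<T>: two items are equal
// when the keys extracted from them are equal.
public class KeyComparer<T> : IEqualityComparer<T>
{
    private readonly Func<T, object> _expr;

    public KeyComparer(Func<T, object> expr)
    {
        _expr = expr;
    }

    public bool Equals(T x, T y)
    {
        return object.Equals(_expr(x), _expr(y));
    }

    // Hash the key, not the item, so items with equal keys land in the same bucket.
    public int GetHashCode(T obj)
    {
        var key = _expr(obj);
        return key == null ? 0 : key.GetHashCode();
    }
}

public static class Program
{
    private static readonly Regex regex = new Regex(@"^(?<num>\d+)\.", RegexOptions.Compiled);

    public static void Main()
    {
        var list = new[] { "1.one", "2. two", "no number", "2.duplicate", "300. three hundred", "4-ignore this" };

        var comparer = new KeyComparer<string>(s => regex.Match(s).Groups["num"].Value);
        var distinctWithNumbers = list.Where(s => regex.IsMatch(s)).Distinct(comparer);

        foreach (var s in distinctWithNumbers)
        {
            Console.WriteLine(s);
        }
    }
}
```

If GetHashCode hashed the whole string instead, "2. two" and "2.duplicate" would almost always get different hash codes and Equals would never be asked about them, which is one plausible reason the original attempt returned duplicates.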

I created another fiddle using this approach.

Using lookahead regex

You can use either of these approaches with the lookahead regular expression @"^\d+(?=\.)".

Just replace the lambda that reads the "num" group, s => regex.Match(s).Groups["num"].Value, with one that reads the value of the whole match, s => regex.Match(s).Value
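
To see why the whole match works as a key with the lookahead pattern, here is a small sketch comparing the two patterns on one input. The variable names named and lookahead are only illustrative.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Program
{
    public static void Main()
    {
        var named = new Regex(@"^(?<num>\d+)\.");   // the period is consumed by the match
        var lookahead = new Regex(@"^\d+(?=\.)");   // the period is required but not consumed

        const string input = "300. three hundred";

        Console.WriteLine(named.Match(input).Value);                // 300.
        Console.WriteLine(named.Match(input).Groups["num"].Value);  // 300
        Console.WriteLine(lookahead.Match(input).Value);            // 300
    }
}
```

With the named group, the match value includes the trailing period, so the group is needed to isolate the number. With the lookahead, the match value is already just the digits.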

Updated fiddle here.

Aug 26 '14 at 19:50

(I could mark this as an answer too)

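This solution runs the regular expression only once per element:

    var regex = new Regex(@"^\d+(?=\.)", RegexOptions.Compiled);
    list.Select(i => {
            // Match once; use -1 as the key for strings that do not match.
            var m = regex.Match(i);
            return new KeyValuePair<int, string>(m.Success ? Int32.Parse(m.Value) : -1, i);
        })
        .Where(i => i.Key > -1)             // drop the non-matching strings
        .GroupBy(i => i.Key)                // group by the parsed number
        .Select(g => g.First().Value);      // keep the first string of each group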

Run it in this fiddle.

Aug 26 '14 at 10:59

Your solution is good enough.

You can also use LINQ query syntax with the let keyword to avoid running the regular expression twice, as follows:

    // source is the input collection (list in the question); regex is the lookahead pattern.
    var result = from kvp in (
                     from s in source
                     let m = regex.Match(s)
                     where m.Success
                     select new KeyValuePair<int, string>(int.Parse(m.Value), s)
                 )
                 group kvp by kvp.Key into gr
                 select gr.First().Value;
Jan 30 '19 at 15:07

Something like this should work:

    List<string> c = new List<string>() { "1.one", "2. two", "no number", "2.duplicate", "300. three hundred", "4-ignore this" };
    c.Where(i => {
        var match = Regex.Match(i, @"^\d+(?=\.)");
        return match.Success;
    });
Aug 26 '14 at 19:51


