Detect duplicate records, select only the first and count using LINQ/C#

I am looking for a little help in developing a query using C#/LINQ to satisfy the following requirements:

I have a list of companies:

    Id  Name       Email    Address
    1   Company A  a@a.com  abc
    2   Company B  b@b.com  abc
    3   Company C  c@c.com  abc
    4   Company D  d@d.com  abc
    5   Company A  a@a.com  abc

My goal is to detect duplicate elements based on two fields, in this example "name" and "email".

The desired output is a list of customers, shown below:

  • Duplicate customers should be shown only once.
  • The number of matching records should be shown for each entry.

Desired duplicate list:

    Id  Qty  Name       Email    Address
    1   2    Company A  a@a.com  abc      (Id/details of first)
    2   1    Company B  b@b.com  abc
    3   1    Company C  c@c.com  abc
    4   1    Company D  d@d.com  abc
2 answers

If you explicitly want to use the record with the lowest id in each set of duplicates, you can use

    var duplicates = companies
        .GroupBy(c => new { c.Name, c.Email })
        .Select(g => new { Qty = g.Count(), First = g.OrderBy(c => c.Id).First() })
        .Select(p => new
        {
            Id = p.First.Id,
            Qty = p.Qty,
            Name = p.First.Name,
            Email = p.First.Email,
            Address = p.First.Address
        });

If you do not care which record's values are used, or if your source is already sorted by Id (ascending), you can omit the OrderBy call.
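To try the grouping end to end, here is a minimal sketch that runs it against the sample rows from the question. It assumes an in-memory collection (LINQ to Objects) and C# 9 or later for top-level statements. The named tuples are only a stand-in for whatever Company type you actually have.

```csharp
using System;
using System.Linq;

// Sample rows from the question. Named tuples stand in for the real
// Company type, which the question does not show (an assumption).
var companies = new[]
{
    (Id: 1, Name: "Company A", Email: "a@a.com", Address: "abc"),
    (Id: 2, Name: "Company B", Email: "b@b.com", Address: "abc"),
    (Id: 3, Name: "Company C", Email: "c@c.com", Address: "abc"),
    (Id: 4, Name: "Company D", Email: "d@d.com", Address: "abc"),
    (Id: 5, Name: "Company A", Email: "a@a.com", Address: "abc"),
};

var summary = companies
    // One group per distinct (Name, Email) pair.
    .GroupBy(c => new { c.Name, c.Email })
    .Select(g => new
    {
        Qty = g.Count(),                      // rows sharing this key
        First = g.OrderBy(c => c.Id).First()  // lowest-Id row represents the group
    })
    // Flatten to the shape of the desired output.
    .Select(p => new { p.First.Id, p.Qty, p.First.Name, p.First.Email, p.First.Address })
    .ToList();

foreach (var row in summary)
    Console.WriteLine($"{row.Id} {row.Qty} {row.Name} {row.Email} {row.Address}");
```

With the sample data this prints four rows, the first being `1 2 Company A a@a.com abc` and the other three each having a Qty of 1. Note that the anonymous-type key compares strings with ordinary, case-sensitive equality, so "a@a.com" and "A@A.com" would land in different groups.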

    from c in companies
    group c by new { c.Name, c.Email } into g
    select new
    {
        Id = g.First().Id,
        Qty = g.Count(),
        Name = g.Key.Name,
        Email = g.Key.Email,
        Address = g.First().Address
    };