Normalize data with LINQ

Suppose we have some denormalized data, for example:

    List<string[]> dataSource = new List<string[]>();
    string[] row1 = {"grandParentTitle1", "parentTitle1", "childTitle1"};
    string[] row2 = {"grandParentTitle1", "parentTitle1", "childTitle2"};
    string[] row3 = {"grandParentTitle1", "parentTitle2", "childTitle3"};
    string[] row4 = {"grandParentTitle1", "parentTitle2", "childTitle4"};
    dataSource.Add(row1);
    dataSource.Add(row2);
    dataSource.Add(row3);
    dataSource.Add(row4);

I need to normalize it, e.g. to get an IEnumerable<Child> with Child.Parent and Child.Parent.GrandParent filled in.

The imperative approach is more or less clear. Would LINQ be shorter?

Ideally this would be a single query, and it should extend to more levels of objects.

I tried something like creating an IEnumerable<GrandParent> separately, then an IEnumerable<Parent> with the GrandParent assigned, and so on.

Please give me a hint: can this be achieved in a functional way?

+6
linq normalization
3 answers

You can do exactly what you want using grouping. Unfortunately, my knowledge of the C# LINQ query syntax is limited, so I can only show you how to call the GroupBy extension method.

    var normalized = dataSource
        .GroupBy(source => source[0], (grandParent, grandParentChilds) => new
        {
            GrandParent = grandParent,
            Parents = grandParentChilds
                .GroupBy(source => source[1], (parent, parentChilds) => new
                {
                    Parent = parent,
                    Children = from source in parentChilds select source[2]
                })
        });

    foreach (var grandParent in normalized)
    {
        Console.WriteLine("GrandParent: {0}", grandParent.GrandParent);
        foreach (var parent in grandParent.Parents)
        {
            Console.WriteLine("\tParent: {0}", parent.Parent);
            foreach (string child in parent.Children)
                Console.WriteLine("\t\tChild: {0}", child);
        }
    }
+1

LINQ is really better suited to the opposite direction, i.e. if the data were already normalized, you could easily flatten it with:

    from g in grandParents
    from p in g.Parents
    from c in p.Children
    select new
    {
        GrandParentName = g.Name,
        ParentName = p.Name,
        ChildName = c.Name
    };

Doing what you ask is more difficult. Something like this:

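    // ParentTitleComparer and GrandParentTitleComparer are assumed to be
    // IEqualityComparer<T> implementations that compare by Title (not shown here).
    var grandparents =
        (from g in dataSource
         select new GrandParent
         {
             Title = g[0],
             Parents =
                 (from p in dataSource
                  where p[0] == g[0]
                  select new Parent
                  {
                      Title = p[1],
                      // Match both levels so that equal parent titles under
                      // different grandparents are not merged.
                      Children = from c in dataSource
                                 where c[0] == p[0] && c[1] == p[1]
                                 select new Child { Title = c[2] }
                  }).Distinct(new ParentTitleComparer())
         }).Distinct(new GrandParentTitleComparer());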

I am not sure this is any better than the imperative version.

0

The easiest way to do this would be with anonymous types:

    from ds0 in dataSource
    group ds0 by ds0[0] into grandparents
    select new
    {
        Grandparent = grandparents.Key,
        Parents =
            from ds1 in grandparents
            group ds1 by ds1[1] into parents
            select new
            {
                Parent = parents.Key,
                Children = from ds2 in parents select ds2[2]
            }
    };

If you want to do this with concrete classes, I would suggest creating a Person class with a constructor that accepts an IEnumerable<Person> representing that Person's children. Then you can do this:

    from ds0 in dataSource
    group ds0 by ds0[0] into grandparents
    select new Person(grandparents.Key,
        from ds1 in grandparents
        group ds1 by ds1[1] into parents
        select new Person(parents.Key,
            from ds2 in parents
            select new Person(ds2[2])));
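For reference, here is a minimal sketch of what such a Person class might look like. The property names, the single-argument constructor for leaf nodes, and the PersonTree.Build helper are all assumptions made for illustration; only the idea of a constructor taking an IEnumerable<Person> comes from the answer above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical Person class: a title plus zero or more child Persons.
public class Person
{
    public string Title { get; }
    public IReadOnlyList<Person> Children { get; }

    // Leaf node (a child with no children of its own).
    public Person(string title) : this(title, Enumerable.Empty<Person>()) { }

    public Person(string title, IEnumerable<Person> children)
    {
        Title = title;
        // Materialize the lazy LINQ query so it is evaluated only once.
        Children = children.ToList();
    }
}

// Hypothetical helper wrapping the nested group-by query.
public static class PersonTree
{
    public static List<Person> Build(IEnumerable<string[]> dataSource) =>
        (from ds0 in dataSource
         group ds0 by ds0[0] into grandparents
         select new Person(grandparents.Key,
             from ds1 in grandparents
             group ds1 by ds1[1] into parents
             select new Person(parents.Key,
                 from ds2 in parents
                 select new Person(ds2[2])))).ToList();
}
```

Calling PersonTree.Build(dataSource) should return one Person per distinct grandparent title, each holding its parents, which in turn hold their children.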

Do any of these solutions work for you?

If you need distinct GrandParent, Parent and Child types, you should be able to adapt the last example.
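As a rough sketch of that adaptation, the query below returns a flat list of Child objects with Parent and Parent.GrandParent filled in, which is the shape asked for in the question. The three classes, their property names, and the Normalizer.Normalize helper are assumptions for illustration. The let clauses create each GrandParent and Parent once per group, so siblings share the same instances.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types; the question does not show their definitions.
public class GrandParent { public string Title { get; set; } }
public class Parent { public string Title { get; set; } public GrandParent GrandParent { get; set; } }
public class Child { public string Title { get; set; } public Parent Parent { get; set; } }

public static class Normalizer
{
    public static List<Child> Normalize(IEnumerable<string[]> dataSource) =>
        (from row in dataSource
         group row by row[0] into g
         // One GrandParent instance per distinct first column.
         let grandParent = new GrandParent { Title = g.Key }
         from p in g.GroupBy(x => x[1])
         // One Parent instance per distinct second column within that grandparent.
         let parent = new Parent { Title = p.Key, GrandParent = grandParent }
         from r in p
         select new Child { Title = r[2], Parent = parent })
        .ToList(); // evaluate once so the shared instances are not recreated
}
```

The ToList call matters here: re-enumerating the lazy query would build new Parent and GrandParent objects on each pass, so reference equality between siblings would only hold within a single enumeration.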

0