Normalizing XML elements

I have some XML that holds combinations of members from, for example, four sets (A, B, C, D). Suppose A = {A1, A2}, B = {B1}, C = {C1, C2} and D = {D1, D2, D3}. The current XML is not normalized, because these members are combined irregularly within each answer. The "set" attribute gives the name of the set, and the "member" attribute gives the member of that set. The XML looks like this:

<root>
  <phrase permutation=ABCD>
    <ans number=1>
      <word set=A member=A1/>
      <word set=A member=A2/>
      <word set=B member=B1/>
      <word set=C member=C1/>
      <word set=D member=D2/>
    </ans>
    <ans number=2>
      <word set=A member=A1/>
      <word set=B member=B1/>
      <word set=C member=C1/>
      <word set=C member=C2/>
      <word set=C member=C3/>
      <word set=D member=D1/>
      <word set=D member=D3/>
    </ans>
  </phrase>
</root>

I want to put each combination in its own answer. Each answer must start with one member of A, end with one member of D, and use exactly one member of each of B and C in between. For example, the answer A1A2B1C1D2 should be split into A1B1C1D2 and A2B1C1D2, and the answer A1B1C1C2C3D1D3 should be split into A1B1C1D1, A1B1C1D3, A1B1C2D1, A1B1C2D3, A1B1C3D1 and A1B1C3D3. The final XML should look like this:

<root>
  <phrase permutation=ABCD>
    <ans number=1>
      <word set=A member=A1/>
      <word set=B member=B1/>
      <word set=C member=C1/>
      <word set=D member=D2/>
    </ans>
    <ans number=2>
      <word set=A member=A2/>
      <word set=B member=B1/>
      <word set=C member=C1/>
      <word set=D member=D2/>
    </ans>
    <ans number=3>
      <word set=A member=A1/>
      <word set=B member=B1/>
      <word set=C member=C1/>
      <word set=D member=D1/>
    </ans>
    <ans number=4>
      <word set=A member=A1/>
      <word set=B member=B1/>
      <word set=C member=C1/>
      <word set=D member=D3/>
    </ans>
    <ans number=5>
      <word set=A member=A1/>
      <word set=B member=B1/>
      <word set=C member=C2/>
      <word set=D member=D1/>
    </ans>
    <ans number=6>
      <word set=A member=A1/>
      <word set=B member=B1/>
      <word set=C member=C2/>
      <word set=D member=D3/>
    </ans>
    <ans number=7>
      <word set=A member=A1/>
      <word set=B member=B1/>
      <word set=C member=C3/>
      <word set=D member=D1/>
    </ans>
    <ans number=8>
      <word set=A member=A1/>
      <word set=B member=B1/>
      <word set=C member=C3/>
      <word set=D member=D3/>
    </ans>
  </phrase>
</root>

I hope my question is clear and that you can help me. Thanks.

1 answer

Well, first of all: note that the attribute values in your XML are not quoted, so a standard XML parser will not be able to read it out of the box. I added the quotes before running the code below.
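For input as simple as the sample in the question, the missing quotes could be added with a regular expression before parsing. This is only a sketch, not part of the original code: the `QuoteAttributes` helper is a hypothetical name, and the pattern assumes that attribute values contain no whitespace, quotes, `/` or `>`, and that no `name=value` text appears in element content.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Xml.Linq;

// Hypothetical helper: wraps bare attribute values in double quotes.
// Assumes values contain no whitespace, quotes, '/' or '>'.
// Values that are already quoted are left untouched.
static string QuoteAttributes(string xml) =>
    Regex.Replace(xml, @"(\w+)=([^\s/>""]+)", "$1=\"$2\"");

// Shortened sample input in the same unquoted style as the question.
var raw = "<root><phrase permutation=ABCD><ans number=1>" +
          "<word set=A member=A1/><word set=B member=B1/>" +
          "</ans></phrase></root>";

// After quoting, the text is well-formed and XDocument can parse it.
var parsed = XDocument.Parse(QuoteAttributes(raw));
Console.WriteLine(parsed);
```

For anything less regular than this sample, fixing the quotes at the source that produces the XML is likely the safer option.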

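using System.Linq;
using System.Xml.Linq;

var original = XDocument.Parse(/* your XML as string */);
var normalized = new XDocument(original);

foreach (var phraseNode in normalized.Root.Elements("phrase"))
{
    // Drop the copied answers; they are rebuilt one combination at a time.
    phraseNode.Elements().Remove();
    int ansNo = 1;

    // Read the answers from the untouched original document.
    // Single() assumes each "permutation" value occurs only once.
    foreach (var answer in original.Root
        .Elements("phrase")
        .Single(p => p.Attribute("permutation").Value == phraseNode.Attribute("permutation").Value)
        .Elements("ans"))
    {
        // Group the words of this answer by the set they belong to.
        var groupedWords = answer.Elements("word")
            .GroupBy(w => w.Attribute("set").Value)
            .ToArray();

        // Build every combination with exactly one word per set:
        // joining on a constant key yields a cross join.
        var newAnswers = groupedWords.Skip(1)
            .Aggregate(
                groupedWords[0].Select(w => Enumerable.Repeat(w, 1)),
                (combinations, newWords) => combinations.Join(
                    newWords, c => 1, w => 1,
                    (c, w) => c.Concat(new[] { w })));

        // Emit one <ans> element per combination, numbered consecutively.
        foreach (var newAnswer in newAnswers)
        {
            var ansNode = new XElement("ans", new XAttribute("number", ansNo++));
            ansNode.Add(newAnswer.Select(w => new XElement(w)).ToArray());
            phraseNode.Add(ansNode);
        }
    }
}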

If you do not know LINQ to XML, this may look a little intimidating. Hopefully, with some light reading or prior knowledge, the only more complex bit (relatively speaking, of course!) will be the code that generates the combinations, that is, the part where the newAnswers variable is initialized. You can either take it at face value or read a bit more about how LINQ's Aggregate and Join work.
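To look at the combination step in isolation, here is a minimal sketch that works on plain strings instead of XElement objects. The `CartesianProduct` helper and the sample data are illustrative assumptions, and it uses SelectMany where the code above uses a Join on a constant key; both produce the same cross product in the same order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: returns every combination that takes exactly
// one item from each group, keeping the groups in their given order.
static IEnumerable<IEnumerable<T>> CartesianProduct<T>(IEnumerable<IEnumerable<T>> groups)
{
    // Start with a single empty combination, then extend every partial
    // combination with each item of the next group.
    IEnumerable<IEnumerable<T>> seed = new[] { Enumerable.Empty<T>() };
    return groups.Aggregate(
        seed,
        (combinations, nextGroup) =>
            combinations.SelectMany(c => nextGroup.Select(item => c.Concat(new[] { item }))));
}

// The members of the second answer in the question, grouped by set.
var words = new[]
{
    new[] { "A1" },
    new[] { "B1" },
    new[] { "C1", "C2", "C3" },
    new[] { "D1", "D3" },
};

// Prints A1B1C1D1, A1B1C1D3, A1B1C2D1, A1B1C2D3, A1B1C3D1, A1B1C3D3 (one per line).
foreach (var combination in CartesianProduct<string>(words))
    Console.WriteLine(string.Concat(combination));
```

Seeding with one empty combination avoids special-casing the first group, which is why this version needs no `Skip(1)`.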

Also, note that this was not written with heavy optimization in mind; in 99.99% of cases that should not be a problem, I hope.
