Getting OpenXmlElements between CommentRangeStart and CommentRangeEnd

Question

What I'm trying to do is find the OpenXmlElements between a CommentRangeStart and the corresponding CommentRangeEnd.

I tried two approaches, but the problem is that the CommentRangeEnd need not be at the same level as the CommentRangeStart. It can be nested inside a child element; see the very simple structure below (note that this is not valid Open XML, it just shows the general idea).

    <w:commentstart/>
    <w:paragraph>
        <w:run />
        <w:commentend />
    </w:paragraph>

The two approaches I tried are the following. First: I wrote an iterator that returns the sibling elements up to the comment end.

    public static IEnumerable<OpenXmlElement> SiblingsUntilCommentRangeEnd(CommentRangeStart commentStart)
    {
        OpenXmlElement element = commentStart.NextSibling();

        // Stop when the siblings run out or the matching comment end is reached.
        while (element != null && !IsMatchingCommentEnd(element, commentStart.Id.Value))
        {
            yield return element;
            element = element.NextSibling();
        }
    }

    public static bool IsMatchingCommentEnd(OpenXmlElement element, string commentId)
    {
        CommentRangeEnd commentEnd = element as CommentRangeEnd;
        if (commentEnd != null)
        {
            return commentEnd.Id == commentId;
        }
        return false;
    }

Second: after realizing that the start and end might not be at the same level, I kept looking and found Eric White's answer on working with the elements between bookmark elements. I retrofitted it for my example, but the start and end not having the same parent (that is, not being at the same level) was still a problem, and I could not use it.

Is there a better way to approach this? I am looking for a way to handle the elements, because I need to work with the text that is being commented on.

Edit: Clarification of what I'm trying to achieve: I take a document edited in Word and, for each comment in the document, I want to get the text that was commented on, that is, the text between the range start and range end for a particular comment id.

Edit 2: I have put up a working version of what I am thinking of right now, but my concern is that it is potentially quite fragile given the various combinations users can produce in Word. It also works on the XML directly, which is not really a problem, but I might like to change it to use the Open XML SDK. Currently, it looks as if I need to parse the entire document to get the elements I need, rather than work with one specific comment. https://github.com/mhbuck/DocumentCommentParser/

Main problem: CommentRangeStart and CommentRangeEnd may be at different nesting levels in the XML document. The root node is potentially their only common ancestor.
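To illustrate the point about nesting, here is a minimal sketch that works on the raw XML with LINQ to XML rather than the Open XML SDK. It walks all descendants in document order, so it does not matter whether the two markers share a parent. The class name `CommentRangeText` and the method `GetCommentedText` are made up for this example and do not come from the linked repository; the sketch also assumes the commented text lives in `w:t` elements and ignores paragraph breaks.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

public static class CommentRangeText
{
    // WordprocessingML main namespace.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical helper: concatenates the text of every w:t element found
    // between the commentRangeStart and commentRangeEnd with the given id.
    public static string GetCommentedText(XElement root, string commentId)
    {
        var parts = new List<string>();
        bool inside = false;

        // Descendants() enumerates in document order, so the nesting depth
        // of the start and end markers does not matter.
        foreach (XElement el in root.Descendants())
        {
            bool idMatches = (string)el.Attribute(W + "id") == commentId;

            if (el.Name == W + "commentRangeStart" && idMatches)
            {
                inside = true;
            }
            else if (el.Name == W + "commentRangeEnd" && idMatches)
            {
                break; // matching end reached, stop collecting
            }
            else if (inside && el.Name == W + "t")
            {
                parts.Add(el.Value);
            }
        }

        return string.Concat(parts);
    }
}
```

Called with the parsed `w:body` element (or the document root) and a comment id such as "0", this returns the commented text as one string. It still scans the document from the top, so it trades efficiency for robustness against differently nested markers.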

c# openxml openxml-sdk

Mike b Aug 29 '12 at 10:06

1 answer

Lukasz M · Answer 1 · 2012-08-29T19:40:52+0000

You can try using the Descendants<T>() method to enumerate all descendants of a given type under a node. Your code might then look something like this (I wrote it without yield to make it more readable ;)):

    public static IEnumerable<OpenXmlElement> SiblingsUntilCommentRangeEnd(CommentRangeStart commentStart)
    {
        List<OpenXmlElement> commentedNodes = new List<OpenXmlElement>();
        OpenXmlElement element = commentStart;

        while (true)
        {
            element = element.NextSibling();

            // Check that the item exists.
            if (element == null)
            {
                break;
            }

            // Check whether the item is the matching comment end.
            if (IsMatchingCommentEnd(element, commentStart.Id.Value))
            {
                break;
            }

            // Check whether the matching comment end is among the current element's descendants.
            bool endFoundInDescendants = false;
            foreach (CommentRangeEnd rangeEndNode in element.Descendants<CommentRangeEnd>())
            {
                if (IsMatchingCommentEnd(rangeEndNode, commentStart.Id.Value))
                {
                    // Matching range end found in the current element's descendants.
                    // An improvement could be made here to manually select the
                    // descendants that precede the CommentRangeEnd node.
                    endFoundInDescendants = true;
                    break;
                }
            }

            // The inner break only leaves the foreach, so leave the outer loop explicitly.
            if (endFoundInDescendants)
            {
                break;
            }

            commentedNodes.Add(element);
        }

        return commentedNodes;
    }

As noted in one of the code comments, the loop ends when it finds the matching CommentRangeEnd element among the descendants of the current element.

I have not tested this code yet, so if you have any problems with it, let me know in the comments.

Note that this method will not work if the start element is deeper in the document hierarchy than the end element. In some cases it will also miss part of the commented content, for example the nodes that precede the CommentRangeEnd inside the element that contains it. If you need that, I can update the answer later with an alternative solution that handles this case. Please also explain why you need to find the commented elements, because a different approach might be more suitable.
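For the case where the start element is nested deeper than the end element, one possible alternative is to stop walking siblings and instead scan every descendant of a common ancestor. The following is only a sketch and has not been run against real Word documents: `CommentRangeWalker` and `TextInCommentRange` are hypothetical names, it needs the DocumentFormat.OpenXml package, and it assumes that `Descendants()` enumerates elements in document order and that only the `Text` elements are of interest.

```csharp
using System.Collections.Generic;
using DocumentFormat.OpenXml;
using DocumentFormat.OpenXml.Wordprocessing;

public static class CommentRangeWalker
{
    // Hypothetical helper: yields the Text elements located between the
    // CommentRangeStart and CommentRangeEnd that carry commentId, no matter
    // how deeply either marker is nested below root.
    public static IEnumerable<Text> TextInCommentRange(OpenXmlElement root, string commentId)
    {
        bool inside = false;

        // Assumption: Descendants() visits elements in document order.
        foreach (OpenXmlElement element in root.Descendants())
        {
            if (element is CommentRangeStart start && start.Id?.Value == commentId)
            {
                inside = true;
            }
            else if (element is CommentRangeEnd end && end.Id?.Value == commentId)
            {
                yield break; // matching end reached
            }
            else if (inside && element is Text text)
            {
                yield return text;
            }
        }
    }
}
```

A typical call would pass `document.MainDocumentPart.Document.Body` as `root`. Because it filters on a flag rather than on sibling relationships, the relative depth of the two markers no longer matters, at the cost of scanning from the start of `root` for each comment.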
