How to find page number from paragraph using OpenXML?

Given a Paragraph object, how can I determine which page it is on using the Open XML SDK 2.0 for Microsoft Office?

+4
3 answers

It is not possible to get page numbers for a word-processing document using the Open XML SDK, because pagination is computed by the client (for example, MS Word) when it renders the document.

However, if the document you are working with was previously opened in a word-processing client and saved again, the client will have added LastRenderedPageBreak elements marking where pages broke the last time it was rendered. See my answer here for more information on LastRenderedPageBreak elements. This allows you to count the LastRenderedPageBreak elements in front of your paragraph to get the page it is on.
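For illustration, this is roughly what such a marker looks like in the underlying document.xml. It is a minimal sketch: the paragraph text is invented, and the `w` prefix is assumed to be bound to the usual WordprocessingML main namespace.

```xml
<w:p>
  <w:r>
    <!-- Written by the client on save: a new page started here when it last rendered the document -->
    <w:lastRenderedPageBreak/>
    <w:t>First text on the next page.</w:t>
  </w:r>
</w:p>
```

The marker sits inside a run, so it can appear at the start of a paragraph or partway through one.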

If this is not the case, a crude workaround is to add footers with page numbers (perhaps in the same color as the page background, to hide them). This is only an option if you are automating the creation of the document yourself with the Open XML SDK.

+5

@Flowerking: thanks for the info.

Since I need to loop over all the paragraphs anyway to find a specific one, I can use the following code to find the page number:

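    using System;
    using System.Collections.Generic;
    using DocumentFormat.OpenXml.Packaging;
    using DocumentFormat.OpenXml.Wordprocessing;

    // Assumed shape of ParagraphInfo (not shown in the original post).
    class ParagraphInfo
    {
        public Paragraph Paragraph { get; set; }
        public int PageNumber { get; set; }
    }

    // The block below goes inside a method.
    using (var document = WordprocessingDocument.Open(@"c:\test.docx", false))
    {
        var paragraphInfos = new List<ParagraphInfo>();
        var paragraphs = document.MainDocumentPart.Document.Descendants<Paragraph>();

        int pageIdx = 1;
        foreach (var paragraph in paragraphs)
        {
            // Only the first run of each paragraph is inspected, so a break
            // later in the paragraph is not counted.
            var run = paragraph.GetFirstChild<Run>();
            if (run != null)
            {
                var lastRenderedPageBreak = run.GetFirstChild<LastRenderedPageBreak>();
                var pageBreak = run.GetFirstChild<Break>();

                // A Break without Type="page" is a line or column break, not a page break.
                bool isPageBreak = pageBreak != null
                    && pageBreak.Type != null
                    && pageBreak.Type.Value == BreakValues.Page;

                if (lastRenderedPageBreak != null || isPageBreak)
                {
                    pageIdx++;
                }
            }

            var info = new ParagraphInfo { Paragraph = paragraph, PageNumber = pageIdx };
            paragraphInfos.Add(info);
        }

        // After the loop, pageIdx holds the total number of pages found.
        foreach (var info in paragraphInfos)
        {
            Console.WriteLine("Page {0}/{1} : '{2}'", info.PageNumber, pageIdx, info.Paragraph.InnerText);
        }
    }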
+2

Here is the extension method I made for this:

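    // Requires: using System.Linq; using DocumentFormat.OpenXml;
    // using DocumentFormat.OpenXml.Wordprocessing;
    // Must be declared inside a static class to work as an extension method.
    public static int GetPageNumber(this OpenXmlElement elem, OpenXmlElement root)
    {
        int pageNbr = 1;
        var tmpElem = elem;

        // Walk up from elem to root; the null check stops the walk if elem
        // is not actually a descendant of root.
        while (tmpElem != null && tmpElem != root)
        {
            // At each level, count the rendered page breaks in everything
            // that precedes the current element.
            var sibling = tmpElem.PreviousSibling();
            while (sibling != null)
            {
                pageNbr += sibling.Descendants<LastRenderedPageBreak>().Count();
                sibling = sibling.PreviousSibling();
            }
            tmpElem = tmpElem.Parent;
        }

        return pageNbr;
    }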
0
