How to properly generate RSID attributes in Word .docx files using Apache POI?

I am using Apache POI to manipulate Microsoft Word .docx files: I open a document that was originally created in Microsoft Word, change it, and save it as a new document.

I notice that new paragraphs created by Apache POI do not have a revision identifier, often referred to as an RSID or rsidR. Word uses this to identify changes made to a document in a single editing session, for example between saves. It is optional, and users can disable it in Microsoft Word if they want, but in practice almost everyone leaves it on, so almost every document is full of RSIDs. Read this excellent RSID explanation for more information.

In a document produced by Microsoft Word, word/document.xml contains paragraphs like this:

    <w:p w:rsidR="007809A1" w:rsidRDefault="007809A1" w:rsidP="00191825">
      <w:r>
        <w:t>Paragraph of text here.</w:t>
      </w:r>
    </w:p>

However, the same paragraph created by POI looks like this in word/document.xml:

    <w:p>
      <w:r>
        <w:t>Paragraph of text here.</w:t>
      </w:r>
    </w:p>

I figured out that I can make POI add an RSID to each paragraph using this code:

    byte[] rsid = ???;
    XWPFParagraph paragraph = document.createParagraph();
    paragraph.getCTP().setRsidR(rsid);
    paragraph.getCTP().setRsidRDefault(rsid);

However, I do not know how I should generate the RSID.

Does POI have a way to generate and/or keep track of RSIDs? If not, is there any way to guarantee that an RSID I create does not conflict with one that is already in the document?

java docx apache-poi
1 answer

It looks like the list of valid RSID entries is stored in word/settings.xml, in the <w:rsids> element. XWPF should already be able to give you access to this.

You will probably want to generate a random number that is 8 hex digits long, check whether it is already in that list, and regenerate it if it is. Once you have a unique one, add it to the list and then tag your paragraphs with it.
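The generate-check-retry loop described above can be sketched in plain Java. This is only an illustration under stated assumptions: `RsidGenerator`, `generateUnique` and `toHex` are hypothetical names and not part of Apache POI, and the set of existing values is assumed to have already been collected from the <w:rsids> list in word/settings.xml as upper-case 8-digit hex strings.

```java
import java.util.Locale;
import java.util.Random;
import java.util.Set;

/** Hypothetical helper for illustration only; not part of Apache POI. */
final class RsidGenerator {

    private RsidGenerator() {}

    /**
     * Draws random 4-byte values until one is found whose 8-hex-digit form
     * is not in existingHex, records it in that set, and returns it.
     * Assumes existingHex holds upper-case hex strings such as "007809A1".
     */
    static byte[] generateUnique(Set<String> existingHex, Random random) {
        byte[] candidate = new byte[4];
        String hex;
        do {
            random.nextBytes(candidate);
            hex = toHex(candidate);
        } while (existingHex.contains(hex)); // collision: try again
        existingHex.add(hex);                // remember the new value
        return candidate;
    }

    /** Formats the bytes as upper-case hex, two digits per byte. */
    static String toHex(byte[] rsid) {
        StringBuilder sb = new StringBuilder(rsid.length * 2);
        for (byte b : rsid) {
            sb.append(String.format(Locale.ROOT, "%02X", b & 0xFF));
        }
        return sb.toString();
    }
}
```

The returned byte[] could then be passed to paragraph.getCTP().setRsidR(rsid) and setRsidRDefault(rsid) as in the question, for example with a java.security.SecureRandom as the random source. Writing the new value back into the <w:rsids> list in word/settings.xml is deliberately left out of this sketch.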

What I would suggest is that you join the POI dev list (mailing list details), and we can give you a hand with working up a patch for this. I think the things that need doing are:

  • A wrapper around the RSids entry in word/settings.xml, so you can easily get the list and generate a new (unique) one
  • Wrappers around the various RSID entries on paragraphs and runs
  • Methods on paragraph and run to get the RSID wrapper, add a new one, or clear an existing one

We should take this to the dev list, though :)
