The main problem is that a rendered ToC depends on pagination to know which page number to put next to each heading. Pagination is provided by the layout engine, a very complex piece of software built into the Word client. Writing a page layout engine in Python is probably not a good idea, and it is definitely not a project I plan to undertake in the near future :)
The ToC consists of two parts:
- the element that specifies where the ToC is placed and settings such as which heading levels to include.
- the actual visible ToC content: headings and page numbers with dotted lines connecting them.
Creating the element is fairly simple and takes relatively little effort. Creating the actual visible content, at least if you want page numbers, requires the Word layout engine.
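To give an idea of how small the element part is, here is a rough sketch that builds the field markup using only the standard library. It assumes the ToC is written as a complex field (`w:fldChar` / `w:instrText` runs) with the switches Word normally uses for a default ToC. The helper names `w` and `build_toc_paragraph`, the placeholder text, and the use of the `w:dirty` attribute to flag the field as stale are my own choices for illustration, not something any particular library does for you.

```python
import xml.etree.ElementTree as ET

W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_NS = "http://www.w3.org/XML/1998/namespace"
ET.register_namespace("w", W_NS)


def w(name):
    """Return the namespaced (Clark notation) form of a WordprocessingML name."""
    return f"{{{W_NS}}}{name}"


def build_toc_paragraph(levels="1-3", dirty=True,
                        placeholder="Update this field to build the table of contents."):
    """Build a <w:p> that holds a TOC field with no rendered entries.

    Hypothetical helper: it only writes the field markup. Headings and
    page numbers still have to be filled in by Word's layout engine.
    """
    p = ET.Element(w("p"))

    def add_run():
        return ET.SubElement(p, w("r"))

    # Start of the field. Marking it dirty declares the result stale.
    begin = ET.SubElement(add_run(), w("fldChar"))
    begin.set(w("fldCharType"), "begin")
    if dirty:
        begin.set(w("dirty"), "true")

    # Field instruction: heading levels to include, hyperlinks, and so on.
    instr = ET.SubElement(add_run(), w("instrText"))
    instr.set(f"{{{XML_NS}}}space", "preserve")
    instr.text = f' TOC \\o "{levels}" \\h \\z \\u '

    # Separator between the instruction and the field result.
    ET.SubElement(add_run(), w("fldChar")).set(w("fldCharType"), "separate")

    # Placeholder result, shown until the field is actually updated.
    ET.SubElement(add_run(), w("t")).text = placeholder

    # End of the field.
    ET.SubElement(add_run(), w("fldChar")).set(w("fldCharType"), "end")
    return p


print(ET.tostring(build_toc_paragraph(), encoding="unicode"))
```

The resulting paragraph would still need to be placed into the document body at the spot where the ToC should appear. Nothing here computes a page number, which is exactly the part that needs Word.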
These are the options:
- Just add the tag and a few other bits to signal to Word that the ToC needs to be updated. When the document is first opened, a dialog box appears saying that links need to be refreshed. The user clicks "Yes" and Bob's your uncle. If the user clicks "No", the ToC title appears with no content below it, and the ToC can be updated manually.
- Add the tag and then drive a Word client, using C# or Visual Basic against the Word Automation library, to open and save the file; all fields (including the ToC field) get updated.
- Do the same thing server-side if you have a SharePoint instance or something similar that can do it with Word Automation Services.
- Create an AutoOpen macro in the document that automatically runs the field update when the document is opened. This probably won't get past many virus scanners and won't work on the locked-down Windows builds that are common in corporate settings.
Here's a very good set of screencasts by Eric White that explains all the hairy details.
scanny