Python: how to change metadata of Microsoft Office files?

How to change metadata of a Microsoft Office document? I found the result number for the Jpg, PNG and PDF file. Anyone can suggest library metadata for Office files?

+4
source share
1 answer

For newer formats, they are often just zipped xml, so you can use standard libraries to unpack and parse XML. Some code to capture the creator of the document previously fooobar.com/questions/1083641 / ... .

import zipfile, lxml.etree

# open zipfile
zf = zipfile.ZipFile('my_doc.docx')
# use lxml to parse the xml file we are interested in
doc = lxml.etree.fromstring(zf.read('docProps/core.xml'))
# retrieve creator
ns={'dc': 'http://purl.org/dc/elements/1.1/'}
creator = doc.xpath('//dc:creator', namespaces=ns)[0].text

For older formats you can look at the hachoir-metadata library

+2
source

All Articles