Python: how to change metadata of Microsoft Office files?

Question

Python: how to change metadata of Microsoft Office files?

How can I change the metadata of a Microsoft Office document? I found a number of results for JPG, PNG and PDF files. Can anyone suggest a metadata library for Office files?

+4

file python-3.x python-2.7 metadata

Ravi gohel Jun 01 '16 at 4:59

1 answer

craigts · Accepted Answer · 2016-06-02T05:37:58+0000

The newer formats (.docx, .xlsx, .pptx) are just zipped XML, so you can use standard libraries to unpack the archive and parse the XML. Here is some code to read the creator of the document, adapted from a previous answer (fooobar.com/questions/1083641 / ...):

import zipfile
import lxml.etree

# open the document as a zip archive and read the core properties part
with zipfile.ZipFile('my_doc.docx') as zf:
    core_xml = zf.read('docProps/core.xml')
# use lxml to parse the XML part we are interested in
doc = lxml.etree.fromstring(core_xml)
# retrieve the creator (an element in the Dublin Core namespace)
ns = {'dc': 'http://purl.org/dc/elements/1.1/'}
creator = doc.xpath('//dc:creator', namespaces=ns)[0].text
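
The snippet above only reads the value. To actually change it, note that a zip archive cannot be edited in place, so one approach is to copy every entry into a new archive and swap in a modified docProps/core.xml. Below is a minimal sketch, assuming Python 3 and using the standard library's xml.etree.ElementTree instead of lxml. The names set_creator, NAMESPACES and CORE_PART, as well as the file names, are hypothetical and chosen for illustration only.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespaces commonly found in docProps/core.xml. Registering them keeps the
# usual prefixes (dc:, cp:, ...) when the tree is serialized again.
NAMESPACES = {
    'cp': 'http://schemas.openxmlformats.org/package/2006/metadata/core-properties',
    'dc': 'http://purl.org/dc/elements/1.1/',
    'dcterms': 'http://purl.org/dc/terms/',
    'dcmitype': 'http://purl.org/dc/dcmitype/',
    'xsi': 'http://www.w3.org/2001/XMLSchema-instance',
}
for prefix, uri in NAMESPACES.items():
    ET.register_namespace(prefix, uri)

CORE_PART = 'docProps/core.xml'


def set_creator(src_path, dst_path, new_creator):
    """Write a copy of src_path to dst_path with dc:creator replaced.

    Hypothetical helper; dst_path must differ from src_path.
    """
    with zipfile.ZipFile(src_path) as src:
        root = ET.fromstring(src.read(CORE_PART))
        creator = root.find('dc:creator', NAMESPACES)
        if creator is None:
            # No creator element yet: add one directly under the root
            creator = ET.SubElement(root, '{%s}creator' % NAMESPACES['dc'])
        creator.text = new_creator
        # Serialize to bytes and prepend an explicit XML declaration
        new_core = (b'<?xml version="1.0" encoding="UTF-8" standalone="yes"?>\n'
                    + ET.tostring(root, encoding='utf-8'))

        # Copy every entry to the new archive, swapping in the edited part
        with zipfile.ZipFile(dst_path, 'w', zipfile.ZIP_DEFLATED) as dst:
            for item in src.infolist():
                if item.filename == CORE_PART:
                    data = new_core
                else:
                    data = src.read(item.filename)
                dst.writestr(item, data)
```

For example, set_creator('my_doc.docx', 'my_doc_updated.docx', 'Jane Doe') would write an updated copy next to the original. Because ElementTree re-serializes the XML, it is worth checking that the result still opens in Office before replacing the original file.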

For the older binary formats (.doc, .xls, .ppt), you can look at the hachoir-metadata library.
