List of fonts used by Word Document (faster method)

I am working on a document verification process to ensure that they comply with corporate standards. One step is to make sure that the Word document is not using invalid fonts.

I have the following code code that works:

Dim wordApplication As Word.ApplicationClass = New Word.ApplicationClass() Dim wordDocument As Word.Document = Nothing Dim fontList As New List(Of String)() Try wordDocument = wordApplication.Documents.Open(FileName:="document Path") 'ive also tried using a for loop with an integer counter, no change in speed' For Each c As Word.Range In wordDocument.Characters If Not fontList.Contains(c.Font.Name) Then fontList.Add(c.Font.Name) End If Next 

But it is incredibly slow! Incredibly slow = 2500 characters per minute (I dated it to StopWatch). Most of my files are about 6,000 words / 30,000 characters (about 25 pages). But there are several documents that are on 100 pages ...

Is there a faster way to do this? I have to support Office 2003 format files, so the Open XML SDK is not an option.

- UPDATE -

I tried to run this as an @ord macro (using the found code @ http://word.tips.net/Pages/T001522_Creating_a_Document_Font_List.html ) and it works much faster (less than a minute). Unfortunately, for my purposes, I do not believe that a macro will work.

- UPDATE # 2 -

I took Chris's advice and converted the document to Open XML format on the fly. Then I used the following code to find all RunFonts objects and read the font name:

  Using docP As WordprocessingDocument = WordprocessingDocument.Open(tmpPath, False) Dim runFonts = docP.MainDocumentPart.Document.Descendants(Of RunFonts)().Select( Function(c) If(c.Ascii.HasValue, c.Ascii.InnerText, String.Empty)).Distinct().ToList() fontList.AddRange(runFonts) End Using 
+6
ms-word office-interop
source share
7 answers

You may need to support Office 2003, but this does not mean that you need to parse it in this format. Take Office 2003 documents, temporarily convert them to DOCX files, open it as a ZIP file, /word/fontTable.xml file and then delete the DOCX.

+5
source share

This is the wrong way, I think. We are looking for the inclusion of a font, not the location of this font. This is an existential rather than a positional problem.

Significantly, much, much faster, repeat the fonts. The only trick is that the Word is sometimes fussy about spaces, etc. It works well for me

 Sub FindAllFonts() Dim lWhichFont As Long, sTempName As String, sBuffer As String For lWhichFont = 1 To FontNames.Count sTempName = FontNames(lWhichFont) If FindThisFont(sTempName) Then sBuffer = sBuffer & "Found " & sTempName & vbCrLf Else If FindThisFont(Replace(sTempName, " ", "")) Then sBuffer = sBuffer & "Found " & sTempName & " *" & vbCrLf End If End If Next Documents.Add Selection.TypeText Text:=sBuffer End Sub Function FindThisFont(sName As String) As Boolean Selection.HomeKey Unit:=wdStory Selection.Find.ClearFormatting With Selection.Find .Font.Name = sName .Forward = True .Format = True .Execute If .Found() Then FindThisFont = True Else FindThisFont = False End If End With End Function 

It works very fast (the only slow component is font iteration)

(Of course, he will not find fonts not on your system, but if you are trying to prepare for transport what you wrote, and some kind of assistant program installed Helvetica or MS Minchin, you can find it)

OK, people told me that not everyone wants this, people want to find fonts that are not on their machines. But the other way is still too slow and requires looking for a lot of things not there. So here is an alternative that is saved as rtf and processes the rtf header.

 Sub FIndAllFonts2() Dim sOldName As String, sOldPath As String, sBuffer As String, sBuffer2 As String, sOut As String, sBit Dim lCounter As Long, lPos As Long, lPos2 As Long, lStopAt As Long Dim objPic As InlineShape, objShp As Shape ' rememer old name for reloading sOldName = ActiveDocument.Path 'delete image to make file out small For Each objPic In ActiveDocument.InlineShapes: objPic.Delete: Next For Each objShp In ActiveDocument.Shapes: objShp.Delete: Next ActiveDocument.SaveAs "tempout.rtf", wdFormatRTF sTempFile = ActiveDocument.Path ActiveDocument.Close lPos2 = 1 ' we only want the header, but we don't know how long the file is 'I am using a Mac, so filesystemobject not available ' if you end up having a huge header, make 2500 bigger lStopAt = 2500 Open sTempFile For Input As #1 Do While Not EOF(1) And lPos2 < lStopAt sBit = Input(1, #1) sBuffer = sBuffer & sBit lPos2 = lPos2 + 1 Loop Close #1 'delete temp file Kill sTempFile ' loop through header, fonts identified in the table as {\f1\ ' if you have more than 100 fonts, make this bigger ' not all numbers are used lStopAt = 100 For lCounter = 1 To lStopAt lPos = InStr(sBuffer, "{\f" & lCounter & "\") If lPos > 0 Then sBuffer = Mid(sBuffer, lPos) lPos = InStr(sBuffer, ";") sBuffer2 = Left(sBuffer, lPos - 1) 'this is where you would look for the alternate name if you want it lPos = InStr(sBuffer2, "{\*\falt") If lPos > 0 Then sBuffer2 = Left(sBuffer2, lPos - 1) sBuffer2 = Mid(sBuffer2, InStrRev(sBuffer2, " ") + 1) & " *" 'indicate it is the shorter ascii name Else sBuffer2 = Mid(sBuffer2, InStrRev(sBuffer2, "}") + 1) End If sOut = sOut & sBuffer2 & vbCrLf End If Next 'reopen old file Documents.Open sOldName Set newdoc = Documents.Add sOut = "Fonts in use in document " & sOldName & vbCrLf & sOut Selection.TypeText Text:=sOut End Sub 

It goes through my 350 page project in 20 seconds on a MacBook Pro. Therefore, it is fast enough to be useful.

+2
source share

If you want all fonts to be used in your document. you could just get all of them on one line using OPEN XML:

  using (WordprocessingDocument doc = WordprocessingDocument.Open(filePath, true)) { var fontlst = doc.MainDocumentPart.FontTablePart.Fonts.Elements<Font>(); } 

Each Font element has its own "Name" property, which is referenced by the element in the properties of the text run.

Hint: you should consider that every word is doc. does not have more than 2 parts of the font table, one in the main part and the other in the glossary, so be sure to also consider the glossary if necessary.

You can download the OPEN XML SDK from here

+1
source share

You can speed up the process by repeating paragraphs. Only if the paragraph contains mixed fonts, you will need to check the character by character. The Name, Bold, Italic, etc. properties have a special "undefined" value, which is an empty string for the name or 9999999 for style attributes.

So, for example, if Bold = 9999999, this means that the paragraph contains some bold and some non-bold characters.

I include the following snippet to show the general idea:

 For Each P as Paragraph in doc.Paragraphs Dim R as Range = P.Range If R.Font.Name = "" Or R.Font.Bold = 9999999 Or R.Font.Italic = 9999999 Or R.Font.Underline = 9999999 Or R.Font.Strikethrough = 9999999 Then ' This paragraph uses mixed fonts, so we need to analyse character by character AnalyseCharacterByCharacter(R) Else ' R.Font is used throughout this paragraph FontHasBeenUsed(R.Font) End If Next 
+1
source share

Another way that I found without encoding is: * Export the document in PDF format * open it in adobe reader * in adobe reader goto: file menu \ properties, and then the fonts tab, which lists the family fonts and sub-fonts used in the document.

+1
source share

Try the following:

 Sub Word_Get_Document_Fonts() Dim report As String Dim J As Integer Dim font_name As String report = "Fonts in use in this document:" & vbCr & vbCr For J = 1 To FontNames.Count font_name = FontNames(J) Set myrange = ActiveDocument.Range myrange.Find.ClearFormatting myrange.Find.Font.Name = font_name With myrange.Find .text = "^?" .Replacement.text = "" .Forward = True .Wrap = wdFindStop .Format = True End With myrange.Find.Execute If myrange.Find.Found Then report = report & font_name & vbCr End If Next J MsgBox (report) End Sub 
0
source share

This can be faster than converting documents to .docx before processing them using OpenXml (for writing, you can also work with the property.Content.WordOpenXML document instead of document.Content.XML):

 using System; using System.Collections.Generic; using System.IO; using System.Linq; using System.Xml.Linq; using Word = NetOffice.WordApi; namespace _5261108 { class Program { static void Main(string[] args) { using (var app = new Word.Application()) { var doc = app.Documents.Open(Path.GetFullPath("test.docx")); foreach (var font in GetFontNames(doc)) { Console.WriteLine(font); } app.Quit(false); } Console.ReadLine(); } private static IEnumerable<string> GetFontNames(Word.Document document) { var xml = document.Content.XML; var doc = XDocument.Parse(xml); var fonts = doc.Descendants().First(n => n.Name.LocalName == "fonts").Elements().Where(n => n.Name.LocalName == "font"); var fontNames = fonts.Select(f => f.Attributes().First(a => a.Name.LocalName == "name").Value); return fontNames.Distinct(); } } } 

Converted for your convenience:

 Imports System.Collections.Generic Imports System.IO Imports System.Linq Imports System.Xml.Linq Imports Word = NetOffice.WordApi Namespace _5261108 Class Program Private Shared Sub Main(args As String()) Using app = New Word.Application() Dim doc = app.Documents.Open(Path.GetFullPath("test.docx")) For Each font As var In GetFontNames(doc) Console.WriteLine(font) Next app.Quit(False) End Using Console.ReadLine() End Sub Private Shared Function GetFontNames(document As Word.Document) As IEnumerable(Of String) Dim xml = document.Content.XML Dim doc = XDocument.Parse(xml) Dim fonts = doc.Descendants().First(Function(n) n.Name.LocalName = "fonts").Elements().Where(Function(n) n.Name.LocalName = "font") Dim fontNames = fonts.[Select](Function(f) f.Attributes().First(Function(a) a.Name.LocalName = "name").Value) Return fontNames.Distinct() End Function End Class End Namespace '======================================================= 'Service provided by Telerik (www.telerik.com) 'Conversion powered by NRefactory. 'Twitter: @telerik 'Facebook: facebook.com/telerik '======================================================= 
0
source share

All Articles