How to speed up DTD loading through DOCTYPE

I need to load some XHTML files that have this at the top:

<?xml version="1.0" encoding="UTF-8"?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"> 

Each file will be loaded into a separate System.Xml.XmlDocument instance. Because of the DOCTYPE declaration, they take a very long time to load. I tried setting XmlResolver = null, but then I get an XmlException because the files use entities that are undefined without the DTD. So I thought I could load the DTD only for the first XmlDocument and somehow reuse it for subsequent XmlDocuments (and thus avoid the performance hit), but I have no idea how to do this.

I am using .Net 3.5.

Thanks.

+6
c# xml
2 answers

I think you should be able to solve this with XmlPreloadedResolver. However, I had some difficulty getting it to work. XHTML 1.0 seems easier to support since it is a "known" DTD (see XmlKnownDtds), whereas XHTML 1.1 is not currently "known", which means you have to preload a bunch of URIs yourself.

For example:

    XmlPreloadedResolver xmlPreloadedResolver = new XmlPreloadedResolver(XmlKnownDtds.Xhtml10);
    xmlPreloadedResolver.Add(new Uri("http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"), File.ReadAllBytes("D:\\xhtml11.dtd"));
    xmlPreloadedResolver.Add(new Uri("http://www.w3.org/TR/xhtml-modularization/DTD/xhtml-inlstyle-1.mod"), File.ReadAllBytes("D:\\xhtml-inlstyle-1.mod"));
    xmlPreloadedResolver.Add(new Uri("http://www.w3.org/TR/xhtml-modularization/DTD/xhtml-framework-1.mod"), File.ReadAllBytes("D:\\xhtml-framework-1.mod"));
    xmlPreloadedResolver.Add(new Uri("http://www.w3.org/TR/xhtml-modularization/DTD/xhtml-text-1.mod"), File.ReadAllBytes("D:\\xhtml-text-1.mod"));
    xmlPreloadedResolver.Add(new Uri("http://www.w3.org/TR/xhtml-modularization/DTD/xhtml-hypertext-1.mod"), File.ReadAllBytes("D:\\xhtml-hypertext-1.mod"));
    xmlPreloadedResolver.Add(new Uri("http://www.w3.org/TR/xhtml-modularization/DTD/xhtml-list-1.mod"), File.ReadAllBytes("D:\\xhtml-list-1.mod"));
    xmlPreloadedResolver.Add(new Uri("http://www.w3.org/TR/xhtml-modularization/DTD/xhtml-edit-1.mod"), File.ReadAllBytes("D:\\xhtml-edit-1.mod"));
    xmlPreloadedResolver.Add(new Uri("http://www.w3.org/TR/xhtml-modularization/DTD/xhtml-bdo-1.mod"), File.ReadAllBytes("D:\\xhtml-bdo-1.mod"));
    xmlPreloadedResolver.Add(new Uri("http://www.w3.org/TR/ruby/xhtml-ruby-1.mod"), File.ReadAllBytes("D:\\xhtml-ruby-1.mod"));
    xmlPreloadedResolver.Add(new Uri("http://www.w3.org/TR/xhtml-modularization/DTD/xhtml-pres-1.mod"), File.ReadAllBytes("D:\\xhtml-pres-1.mod"));
    // TODO: Add other modules here (see the xhtml11.dtd for the full list)

    XmlDocument xmlDocument = new XmlDocument();
    xmlDocument.XmlResolver = xmlPreloadedResolver;
    xmlDocument.Load("D:\\doc1.xml");
+4

For the .NET Framework 3.5 and below, it might have been possible to use XmlUrlResolver, as shown in this answer. However, that approach downloads the DTDs from the W3C website at run time, which is not a good idea, not least because the W3C currently seems to block such requests. Another answer suggests caching the DTDs as embedded resources in an assembly, in the manner of HTML2XHTML.
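As a rough illustration of the caching idea on .NET 3.5, where XmlPreloadedResolver is not available, a small custom XmlResolver can answer DTD requests from memory. This is only a sketch under assumptions: CachedDtdResolver is a hypothetical class name, the D:\ paths are placeholders, and it assumes a local copy of the flattened xhtml11-flat.dtd that needs no further external modules.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Xml;

// Hypothetical helper (not a framework class): answers DTD requests from memory.
public class CachedDtdResolver : XmlResolver
{
    private readonly Dictionary<string, byte[]> cache = new Dictionary<string, byte[]>();

    // Registers the bytes to return when the parser asks for the given absolute URI.
    public void Add(string absoluteUri, byte[] content)
    {
        cache[new Uri(absoluteUri).AbsoluteUri] = content;
    }

    // No network access happens here, so credentials are ignored.
    public override ICredentials Credentials
    {
        set { }
    }

    public override object GetEntity(Uri absoluteUri, string role, Type ofObjectToReturn)
    {
        byte[] content;
        if (absoluteUri != null && absoluteUri.IsAbsoluteUri
            && cache.TryGetValue(absoluteUri.AbsoluteUri, out content))
        {
            // A fresh read-only stream per request; the cached bytes are shared.
            return new MemoryStream(content, false);
        }
        // Fail loudly rather than silently falling back to a download.
        throw new XmlException("No cached entity for " + absoluteUri);
    }
}

public static class Program
{
    public static void Main()
    {
        CachedDtdResolver resolver = new CachedDtdResolver();
        resolver.Add("http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd",
                     File.ReadAllBytes("D:\\xhtml11-flat.dtd")); // assumed local copy

        foreach (string path in new string[] { "D:\\doc1.xml", "D:\\doc2.xml" })
        {
            XmlDocument document = new XmlDocument();
            document.XmlResolver = resolver;
            // Load from a stream so only the DTD lookup goes through the resolver.
            using (FileStream stream = File.OpenRead(path))
            {
                document.Load(stream);
            }
            Console.WriteLine(path + ": " + document.DocumentElement.Name);
        }
    }
}
```

The DTD text is read from disk once and shared by every document, so nothing is fetched from the W3C. The parser still processes the DTD for each document, so this removes the download cost rather than the parsing cost. The documents are opened as streams because, as far as I can tell, Load(string) would also route the document's own path through the resolver, which this sketch does not handle.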

For other readers using the .NET Framework 4.0 and above, you can use XmlPreloadedResolver as suggested by Daniel Renshaw, which supports XHTML 1.0 out of the box. To support XHTML 1.1, you can simplify the implementation by using the flattened version of the DTD, available as xhtml11-flat.dtd on the W3C website. For this purpose, I define an extension method:

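    using System;
    using System.IO;
    using System.Linq; // for Contains on PreloadedUris
    using System.Xml;
    using System.Xml.Resolvers;

    public static class XmlPreloadedResolverExtensions
    {
        private const string Xhtml11DtdPublicId = "-//W3C//DTD XHTML 1.1//EN";
        private const string Xhtml11DtdSystemId = "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd";

        // Registers the flattened XHTML 1.1 DTD under both its public and system identifiers.
        // ManifestResources.xhtml11_flat_dtd is my own accessor for the embedded DTD;
        // it is expected to return a fresh Stream on each access.
        public static void AddXhtml11(this XmlPreloadedResolver resolver, bool @override = false)
        {
            Add(resolver, new Uri(Xhtml11DtdPublicId, UriKind.RelativeOrAbsolute), ManifestResources.xhtml11_flat_dtd, @override);
            Add(resolver, new Uri(Xhtml11DtdSystemId, UriKind.RelativeOrAbsolute), ManifestResources.xhtml11_flat_dtd, @override);
        }

        // Adds the mapping unless the URI is already preloaded and @override is false.
        public static bool Add(this XmlPreloadedResolver resolver, Uri uri, Stream value, bool @override)
        {
            if (@override || !resolver.PreloadedUris.Contains(uri))
            {
                resolver.Add(uri, value);
                return true;
            }

            return false;
        }
    }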

It can then be used like any regular XmlResolver instance:

    var xmlResolver = new XmlPreloadedResolver();
    xmlResolver.AddXhtml11();

    XmlReaderSettings settings = new XmlReaderSettings();
    settings.DtdProcessing = DtdProcessing.Parse;
    settings.XmlResolver = xmlResolver;

    XDocument document;
    using (var xmlReader = XmlReader.Create(input, settings))
        document = XDocument.Load(xmlReader);
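Since the question is about loading many files into separate XmlDocument instances, here is a minimal sketch of sharing one resolver and one settings object across all of them. XhtmlLoader and LoadAll are hypothetical names used only for illustration, and the resolver passed in is assumed to already contain the XHTML 1.1 DTD.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper: loads each file into its own XmlDocument
// while reusing a single resolver and settings object.
public static class XhtmlLoader
{
    public static List<XmlDocument> LoadAll(IEnumerable<string> paths, XmlResolver resolver)
    {
        var settings = new XmlReaderSettings();
        settings.DtdProcessing = DtdProcessing.Parse;
        settings.XmlResolver = resolver;

        var documents = new List<XmlDocument>();
        foreach (string path in paths)
        {
            // Open the file ourselves so that only DTD lookups go through the resolver.
            using (FileStream stream = File.OpenRead(path))
            using (XmlReader reader = XmlReader.Create(stream, settings))
            {
                var document = new XmlDocument();
                // Also give the document the resolver, in case it resolves anything itself.
                document.XmlResolver = resolver;
                document.Load(reader);
                documents.Add(document);
            }
        }
        return documents;
    }
}
```

The resolver keeps the DTD bytes in memory, so repeated loads avoid both the network and repeated disk reads of the DTD. Each document still has the DTD parsed for it.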
+1
