One way to fix this problem is to patch libxml2.
Looking at the source code of libxml2 2.9.2 (https://git.gnome.org/browse/libxml2/tree/?id=v2.9.2), in SAX2.c (https://git.gnome.org/browse/libxml2/tree/SAX2.c?id=v2.9.2), the internal SAX parser used to build the DOM tree, attributes using xmlns get no special treatment in HTML mode (line 1699) and are parsed like any other attribute (line 1740). It therefore makes sense to adjust line 1622, which splits the element name into a prefix and a local part. Change:
name = xmlSplitQName(ctxt, fullname, &prefix);
to:
if (!ctxt->html) {
    name = xmlSplitQName(ctxt, fullname, &prefix);
} else {
    name = xmlStrdup(fullname);
    prefix = NULL;
}
With this change, libxml2 treats tags such as <o:p> as elements named o:p; that is, the colon is simply part of the element name and has no special meaning. This is the correct interpretation in HTML. For example, the HTML5 specification says:
"In the HTML syntax, namespace prefixes and namespace declarations do not have the same effect as in XML. For instance, the colon has no special meaning in HTML element names."
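To make the effect of the patch concrete, here is a minimal standalone sketch of the same branching logic. It does not use libxml2 at all: split_element_name, split_name and split_name_free are hypothetical names invented for this illustration, and the colon handling is a simplified stand-in for what xmlSplitQName does, not a copy of it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type for this sketch; not a libxml2 structure. */
typedef struct {
    char *prefix; /* NULL when the name carries no namespace prefix */
    char *local;  /* the element name as it would be stored in the tree */
} split_name;

/* Copy the first n bytes of s into a freshly allocated C string. */
static char *dup_range(const char *s, size_t n) {
    char *out = malloc(n + 1);
    if (out == NULL) return NULL;
    memcpy(out, s, n);
    out[n] = '\0';
    return out;
}

/*
 * Simplified model of the patched code path (an assumption, not libxml2 code):
 * in XML mode the name is split at the first colon, in HTML mode the full
 * name is kept as-is and no prefix is produced.
 */
split_name split_element_name(const char *fullname, int html) {
    split_name r = { NULL, NULL };
    assert(fullname != NULL);

    /* In HTML mode we never look for a colon at all. */
    const char *colon = html ? NULL : strchr(fullname, ':');

    if (colon == NULL || colon == fullname || colon[1] == '\0') {
        /* No usable prefix: keep the whole name, colon included. */
        r.local = dup_range(fullname, strlen(fullname));
    } else {
        r.prefix = dup_range(fullname, (size_t)(colon - fullname));
        r.local = dup_range(colon + 1, strlen(colon + 1));
    }
    return r;
}

/* Release the strings owned by a split_name. */
void split_name_free(split_name *n) {
    free(n->prefix);
    free(n->local);
    n->prefix = NULL;
    n->local = NULL;
}
```

Under these assumptions, split_element_name("o:p", 0) yields the prefix o and the local name p, while split_element_name("o:p", 1) yields no prefix and the name o:p, which is the behavior the patch aims for in HTML mode.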
Hopefully this change will be accepted into a future version of libxml2. There is an open bug report (https://bugzilla.gnome.org/show_bug.cgi?id=654146).