Load HTML containing namespaces with DOMDocument

I want to load an HTML fragment that contains a namespaced element with DOMDocument.

<div class="something-first">
    <div class="something-child something-good another something-great">
        <my:text value="huhu">
    </div>
</div>

But I can't figure out how to preserve the namespaces. I tried loading it with loadHTML(), but HTML has no namespaces, so the prefixes get stripped.

I also tried loading it with loadXML(), but that does not work because <my:text value="huhu"> is not well-formed XML.

I need a loadHTML() that does not strip namespaces, or a loadXML() that does not check the markup. In other words, a combination of the two methods.

My code is:

$html = '<div class="something-first">
    <div class="something-child something-good another something-great">
        <my:text value="huhu">
    </div>
</div>';

libxml_use_internal_errors(true);

$domDoc = new DOMDocument();
$domDoc->formatOutput = false;
$domDoc->resolveExternals = false;
$domDoc->substituteEntities = false;
$domDoc->strictErrorChecking = false;
$domDoc->validateOnParse = false;

$domDoc->loadHTML($html/*, LIBXML_NOERROR | LIBXML_NOWARNING*/);
$xpath = new DOMXPath($domDoc);
$xpath->registerNamespace ( 'my', 'http://www.example.com/' );

// -----> This results in zero nodes cause namespace gets stripped by loadHTML()
$nodes = $xpath->query('//my:*');
var_dump($nodes);

Is there any way to achieve what I want? I would be very happy for any advice.

EDIT: libxml2 bug report regarding namespaces in HTML: https://bugzilla.gnome.org/show_bug.cgi?id=711670


First of all, HTML is not XML (unless it is XHTML), and HTML has no namespaces.


If you have XHTML with an xmlns declaration, you can use DOMDocument::getElementsByTagNameNS():

$html = <<<EOF
<div xmlns:my="http://www.example.com/" class="something-first">
    <div class="something-child something-good another something-great">
        <my:text value="huhu" />
    </div>
</div>
EOF;

$domDoc = new DOMDocument();
$domDoc->loadXML($html);
var_dump(
  // it is possible to use wildcard `*` here
  $domDoc->getElementsByTagNameNS('http://www.example.com/', '*')
);
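
As a small follow-up sketch (variable names such as $results are only illustrative, not part of the original answer), the returned node list can be iterated to read each element's prefix, local name, and attributes:

```php
<?php
// Well-formed fragment with the namespace declared on the root element.
$xml = <<<EOF
<div xmlns:my="http://www.example.com/" class="something-first">
    <div class="something-child something-good another something-great">
        <my:text value="huhu" />
    </div>
</div>
EOF;

$doc = new DOMDocument();
$doc->loadXML($xml);

// Collect prefix, local name and the "value" attribute of every
// element that belongs to the http://www.example.com/ namespace.
$results = [];
foreach ($doc->getElementsByTagNameNS('http://www.example.com/', '*') as $node) {
    $results[] = [$node->prefix, $node->localName, $node->getAttribute('value')];
}

print_r($results);
```

The lookup is done by namespace URI, so it does not depend on which prefix the document happens to use.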

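Without a namespace declaration, or with plain HTML input, this will not work.

If you need namespaces, the input has to be XML/XHTML. It cannot be done with HTML.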


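XML is not HTML (unless it is XHTML): HTML has no namespaces, while XML does. So the real question is: "Can DOMDocument load HTML as XML when it is not well-formed XML?" That is up to libxml. If the fragment is made well-formed and declares the namespace, for example: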

$html = <<<XML
<div xmlns:my="http://www.example.com/" class="something-first">
    <div class="something-child something-good another something-great">
        <my:text value="huhu" />
    </div>
</div>
XML;

then the namespace on my:text is preserved:

$domDoc = new DOMDocument();
$domDoc->loadXML($html);
echo $domDoc->saveXML();

This works, but note that the input is now XML, not HTML. XPath works as well, as long as the xmlns prefix is registered.
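
A minimal sketch of the XPath variant, assuming the same well-formed fragment as above. The prefix passed to registerNamespace() only needs to map to the same URI that the document declares:

```php
<?php
// Same well-formed fragment, with xmlns:my declared on the root.
$xml = <<<XML
<div xmlns:my="http://www.example.com/" class="something-first">
    <div class="something-child something-good another something-great">
        <my:text value="huhu" />
    </div>
</div>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

$xpath = new DOMXPath($doc);
// Bind the prefix used in the query to the namespace URI.
$xpath->registerNamespace('my', 'http://www.example.com/');

$nodes = $xpath->query('//my:*');
echo $nodes->length, "\n";            // 1
echo $nodes->item(0)->nodeName, "\n"; // my:text
```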

So the question back to you is: can you make your input well-formed XML, or does it have to stay HTML?
