Using DOMDocument to Extract Elements from an HTML Document by Class

In the DOMDocument class, there are methods for getting elements by id and by tag name (getElementById and getElementsByTagName), but not by class. Is there any way to do this?

As an example, how would I select a div from the following markup?

<html>
...
<body>
...
<div class="foo">
...
</div>
...
</body>
</html>

The simple answer is to use XPath:

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$div = $xpath->query('//*[@class="foo"]')->item(0);

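However, this matches only when the class attribute is exactly "foo"; it fails for an element that has several space-separated classes. To match one class within a space-separated list, use this query, where class stands for the class name you are looking for:

//*[contains(concat(' ', normalize-space(@class), ' '), ' class ')]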
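The query above can be wrapped in a small function. This is only a sketch: getElementsByClassName is a hypothetical helper name, not a DOMDocument method, and it assumes the class name contains no quote characters.

```php
<?php
// Hypothetical helper (not part of DOMDocument): returns every element whose
// class attribute contains $class as a whole whitespace-separated token.
// Assumes $class contains no quote characters.
function getElementsByClassName(DOMDocument $dom, string $class): DOMNodeList
{
    $xpath = new DOMXPath($dom);
    $query = sprintf(
        "//*[contains(concat(' ', normalize-space(@class), ' '), ' %s ')]",
        $class
    );
    return $xpath->query($query);
}

// Sample markup invented for this illustration.
$html = '<html><body><div class="foo bar">A</div><div class="foobar">B</div></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$matches = getElementsByClassName($dom, 'foo');
echo $matches->length, "\n"; // prints 1: "foo bar" matches, "foobar" does not
```

Padding the attribute value with spaces on both sides is what lets the test find the class at the start, middle, or end of the list without matching longer names that merely begin with the same letters.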
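
// Sample markup: three divs with class "foo" and one with class "bar".
$html = '<html><body><div class="foo">Test</div><div class="foo">ABC</div><div class="foo">Exit</div><div class="bar"></div></body></html>';

// Collect libxml parse warnings internally instead of suppressing them with @.
libxml_use_internal_errors(true);
$dom = new DOMDocument();
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);

// "//@class" selects every class attribute node in the document.
$allClass = $xpath->query("//@class");
// Elements whose class attribute is exactly "bar".
$allClassBar = $xpath->query("//*[@class='bar']");

echo "There are " . $allClass->length . " elements with a class attribute<br>";

echo "There are " . $allClassBar->length . " elements with a class attribute of 'bar'<br>";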
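Besides counting, the node list returned by query() can be iterated with foreach. A minimal sketch that reuses the sample markup above and collects the text of each matching div; the $texts variable is introduced here only for illustration.

```php
<?php
$html = '<html><body><div class="foo">Test</div><div class="foo">ABC</div><div class="foo">Exit</div><div class="bar"></div></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Collect the text of every element whose class attribute is exactly "foo".
$texts = [];
foreach ($xpath->query("//*[@class='foo']") as $node) {
    $texts[] = $node->textContent;
}

echo implode(', ', $texts), "\n"; // prints "Test, ABC, Exit"
```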

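Building on ircmaxell's answer, with the class name in a variable:

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$classname = 'foo';
// Note: contains() is a substring test, so this also matches classes such as "foobar".
$div = $xpath->query("//div[contains(@class, '$classname')]")->item(0);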
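One caveat with a plain contains(@class, ...) test is that it matches substrings. The sketch below, using markup invented for this illustration, compares it with the space-padded query from the first answer; the $loose and $strict names are assumptions of this example.

```php
<?php
// Sample markup invented for this comparison.
$html = '<html><body><div class="foo">A</div><div class="foobar">B</div><div class="big foo">C</div></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);
$classname = 'foo';

// Substring test: also matches class="foobar".
$loose = $xpath->query("//div[contains(@class, '$classname')]");

// Whole-token test: matches only elements that carry the class "foo" itself.
$strict = $xpath->query(
    "//div[contains(concat(' ', normalize-space(@class), ' '), ' $classname ')]"
);

echo $loose->length, "\n";  // prints 3
echo $strict->length, "\n"; // prints 2
```

If class names in the document can share prefixes or suffixes, the whole-token form is probably the safer choice.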