Select HTML elements by the name of their first child

I need to find the value of the id attribute of all <div> elements whose first child element is a <span>.

For example, given this HTML

<div id="a1">                 <span> xa1 </span>       </div>
<div id="a2"> <p>...</p>      <span> xa2 </span>       </div>
<div id="a3">            <p>  <span> xa3 </span> </p>  </div>
<div id="a4"> <p>...</p>                             </div>

<div id="b1"> </div>          <span> xb1 </span>
<div id="b2"> </div> <p>      <span> xb1 </span> </p>
<div id="b3"> </div> <p>.</p> <span> xb3 </span>

I need to get a1 and nothing more.

Since CSS selectors have nothing like a positive look-ahead, I suppose I need to search the HTML in several passes, but I don't know how to do it.

How can I change the following code to get only a1?

use 5.014;
use warnings;

use Mojo::DOM;

my $html = do {local $/; <DATA>};

my $dom = Mojo::DOM->new($html);

for my $div ($dom->find('div')->each) {
   #say "DIV[[$div]]";
   my @spans = $div->find('div > span')->each;   #found a1 and a2 ;(
   say $div->attr('id') if (@spans == 1);
}

__DATA__
<div id="a1">                 <span> xa1 </span>       </div>
<div id="a2"> <p>...</p>      <span> xa2 </span>       </div>
<div id="a3">            <p>  <span> xa3 </span> </p>  </div>
<div id="a4"> <p>...</p>                             </div>

<div id="b1"> </div>          <span> xb1 </span>
<div id="b2"> </div> <p>      <span> xb1 </span> </p>
<div id="b3"> </div> <p>.</p> <span> xb3 </span>

<p id="p1">                <span> xp1 </span>       </p>
<p id="p2"> <p>...</p>     <span> xp2 </span>       </p>
<p id="p3">            <p> <span> xp3 </span> </p>  </p>
<p id="p4"> <p>...</p>                              </p>
3 answers

You can get the element you are looking for in a slightly roundabout way, using CSS selectors and the Mojo::DOM parent method:

use strict;
use warnings;
use feature ":5.10";
use Mojo::DOM;

my $html = do{ local $/; <DATA>};

my $dom = Mojo::DOM->new($html);

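# find spans that are the first child element of a div, then take their parent divs
# (find returns a Mojo::Collection; map('parent') calls parent on each match)
for my $div ( $dom->find('div > span:first-child')->map('parent')->each ) {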
    say "id: " . $div->attr('id') if $div->attr('id');
}

__DATA__
<div id="a1">                 <span> xa1 </span>       </div>
<div id="a2"> <p>...</p>      <span> xa2 </span>       </div>
<div id="a3">            <p>  <span> xa3 </span> </p>  </div>
<div id="a4"> <p>...</p>                             </div>

<div id="b1"> </div>          <span> xb1 </span>
<div id="b2"> </div> <p>      <span> xb1 </span> </p>
<div id="b3"> </div> <p>.</p> <span> xb3 </span>

<p id="p1">                <span> xp1 </span>       </p>
<p id="p2"> <p>...</p>     <span> xp2 </span>       </p>
<p id="p3">            <p> <span> xp3 </span> </p>  </p>
<p id="p4"> <p>...</p>                              </p>

Output:

id: a1

If you know that only one div will match, you can use at, which returns the first matching element:

say "id: " . $dom->at('div > span:first-child')->parent->attr('id');
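Note that at returns undef when nothing matches, so the chained call above would die on a document that has no such div. A minimal defensive sketch, assuming a made-up input without a matching div:

```perl
use strict;
use warnings;
use feature 'say';
use Mojo::DOM;

# Hypothetical input: no div here has a span as its first child element
my $dom = Mojo::DOM->new('<div id="a4"> <p>...</p> </div>');

# at() returns undef when there is no match, so check before chaining
my $span = $dom->at('div > span:first-child');
my $id   = $span ? $span->parent->attr('id') : undef;

say defined $id ? "id: $id" : "no matching div";
```

Checking the result of at before calling parent keeps the script from dying on an undefined value.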

Mojo::DOM does not support XPath, only CSS selectors.

An alternative is HTML::TreeBuilder::XPath. The XPath expression

//div[*][local-name(*[1])="span"]/@id

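selects the id attribute of every div element that has at least one child element and whose first child element is a span.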

use strict;
use warnings;
use 5.014;

use HTML::TreeBuilder::XPath;

my $tree = do {
   local $/;
   HTML::TreeBuilder::XPath->new_from_content(<DATA>);
};

say for $tree->findvalues('//div[*][local-name(*[1])="span"]/@id');

__DATA__
<html><body>
<div id="a1">                 <span> xa1 </span>       </div>
<div id="a2"> <p>...</p>      <span> xa2 </span>       </div>
<div id="a3">            <p>  <span> xa3 </span> </p>  </div>
<div id="a4"> <p>...</p>                             </div>

<div id="b1"> </div>          <span> xb1 </span>
<div id="b2"> </div> <p>      <span> xb1 </span> </p>
<div id="b3"> </div> <p>.</p> <span> xb3 </span>

<p id="p1">                <span> xp1 </span>       </p>
<p id="p2"> <p>...</p>     <span> xp2 </span>       </p>
<p id="p3">            <p> <span> xp3 </span> </p>  </p>
<p id="p4"> <p>...</p>                              </p>
</body></html>

Output:

a1

Or this, as the body of the loop in the question:

my @spans = $div->find('div > span:first-child')->each;
say $div->attr('id') if (@spans == 1);

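Or this:

# children returns a Mojo::Collection, so expand it with each;
# recent Mojolicious releases name the element with tag (older ones used type)
my @kids = $div->children->each;
say $div->attr('id') if @kids and $kids[0]->tag eq 'span';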
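For reference, the children-based check can be put together as a complete script. This is only a sketch: first_child_ids is a hypothetical helper name, the sample HTML is a shortened version of the question's data, and it assumes a Mojolicious release in which tag returns the element name.

```perl
use strict;
use warnings;
use feature 'say';
use Mojo::DOM;

# first_child_ids is a hypothetical helper, not part of Mojo::DOM
sub first_child_ids {
    my ($html, $parent_tag, $child_tag) = @_;
    my @ids;
    for my $el (Mojo::DOM->new($html)->find($parent_tag)->each) {
        # children() returns a Mojo::Collection holding element nodes only
        my $first = $el->children->first or next;
        next unless $first->tag eq $child_tag;
        push @ids, $el->attr('id') if defined $el->attr('id');
    }
    return @ids;
}

my $html = <<'HTML';
<div id="a1"> <span> xa1 </span> </div>
<div id="a2"> <p>...</p> <span> xa2 </span> </div>
<div id="a3"> <p> <span> xa3 </span> </p> </div>
<div id="b1"> </div> <span> xb1 </span>
HTML

say for first_child_ids($html, 'div', 'span');
```

In this sample only the div with id a1 has a span as its first child element; a2 and a3 start with a p, and b1 has no children at all.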