CSS selector to get element attribute value

The HTML structure looks like this:

<td class='hey'> <a href="https://example.com">First one</a> </td> 

This is my selector:

 m_URL = sel.css("td.hey a:nth-child(1)[href] ").extract() 

My selector currently outputs the whole element, <a href="https://example.com">First one</a>, but I want it to output only the link itself: https://example.com.

How can I do this?

python css-selectors web-scraping scrapy
2 answers

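Use the ::attr() pseudo-element on the a tag to get the attribute value instead of the element; here that is ::attr(href). Note that ::attr() is a Scrapy extension to CSS selectors, not part of standard CSS.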

Demo (using the Scrapy shell):

    $ scrapy shell index.html
    >>> response.css('td.hey a:nth-child(1)::attr(href)').extract()
    [u'https://example.com']

where index.html contains:

    <table>
      <tr>
        <td class='hey'> <a href="https://example.com">First one</a> </td>
      </tr>
    </table>

You can also keep the CSS selector for the element and chain an XPath query onto it to pick out the href attribute:

 m_URL = sel.css("td.hey a:nth-child(1)").xpath('@href').extract() 
