How to get all the HTML contained inside a tag?
hxs = HtmlXPathSelector(response)
element = hxs.select('//span[@class="title"]/')
Maybe something like:
hxs.select('//span[@class="title"]/html()')
EDIT: Looking at the documentation, I only see methods that return a new XPathSelectorList or just the raw text inside the tag. I don't want a new list or just the text; I want the HTML source code inside the tag. For example, given:
<html>
  <head>
    <title></title>
  </head>
  <body>
    <div id="leexample">
      justtext
      <p class="ihatelookingforfeatures"> sometext </p>
      <p class="yahc"> sometext </p>
    </div>
    <div id="lenot"> blabla </div>
    an awfuly long example for this.
  </body>
</html>
I want a method such as hxs.select('//div[@id="leexample"]/html()') that returns the HTML inside the matched tag, like so:
justtext <p class="ihatelookingforfeatures"> sometext </p> <p class="yahc"> sometext </p>
I hope this clears up any ambiguity in my question.
How can I get the HTML from an HtmlXPathSelector in Scrapy? (Maybe the solution lies outside Scrapy's scope?)
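To make the desired behavior concrete, here is a minimal sketch of an "inner HTML" helper that works outside Scrapy, using only Python's standard library. It assumes the markup is well-formed enough to parse as XML, which real pages often are not. The names inner_html and SAMPLE are made up for this illustration and are not part of Scrapy. Inside Scrapy, selecting the child nodes with the XPath node() test and joining the extracted strings might achieve the same thing, but I have not verified that.

```python
import xml.etree.ElementTree as ET

# The example document from the question (assumed to be well-formed XML).
SAMPLE = """<html>
<head> <title></title> </head>
<body>
<div id="leexample"> justtext <p class="ihatelookingforfeatures"> sometext </p> <p class="yahc"> sometext </p> </div>
<div id="lenot"> blabla </div>
an awfuly long example for this.
</body>
</html>"""


def inner_html(element):
    """Return the markup between an element's opening and closing tags."""
    # Text that appears before the first child element.
    parts = [element.text or ""]
    # tostring() serializes each child together with its trailing text (tail).
    parts.extend(
        ET.tostring(child, encoding="unicode", method="html") for child in element
    )
    return "".join(parts)


root = ET.fromstring(SAMPLE)
# ElementTree supports a limited XPath subset, enough for an attribute match.
div = root.find(".//div[@id='leexample']")
result = inner_html(div)
print(result)
```

The helper returns the leading text plus each serialized child, so the enclosing div tag itself is left out, which is the result I am after.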