Replacing elements with lxml.html

I am new to lxml and HTML Parsers in general. I was wondering if there is a way to replace an element in the tree with another element ...

For example, I have:

body = """<code> def function(arg): print arg </code> Blah blah blah <code> int main() { return 0; } </code> """ doc = lxml.html.fromstring(body) codeblocks = doc.cssselect('code') for block in codeblocks: lexer = guess_lexer(block.text_content()) hilited = highlight(block.text_content(), lexer, HtmlFormatter()) doc.replace(block, hilited) 

I want to do something along these lines, but this leads to a "TypeError" because "hilited" is not lxml.etree._Element.

Is it possible?

Hi,

+7
python lxml
source share
2 answers

Regarding lxml,

In doc.replace(block, hilited)

a block is an object of the lxml element, hilited is a string, you cannot replace it.

There are two ways to do this.

 block.text=hilited 

or

 body=body.replace(block.text,hilited) 
+4
source share

If you are new to python HTML parsers, you can try BeautifulSoup , an html / xml parser that allows you to easily modify the parse tree .

0
source share

All Articles