Since you have not provided any sample HTML or the desired output, here is a general solution:
You can select SGML comments in XPath using the comment() node test; you can remove them from the document by calling .remove on all comment nodes. Illustrated:
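    require 'nokogiri'

    doc = Nokogiri.XML('<r><b>hello</b> <!-- foo --> world</r>')
    p doc.root.inner_html #=> "<b>hello</b> <!-- foo --> world"

    # Select every comment node in the document and remove it in place.
    doc.xpath('//comment()').remove
    p doc.root.inner_html #=> "<b>hello</b>  world"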
Please note that the above modifies the document to remove comments. If you want the original document not to be modified, you can also do this:
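    class Nokogiri::XML::Node
      # Return inner_html with the nodes matching xpath removed. The removal
      # runs on a copy, so the receiver is left untouched.
      def inner_html_reject(xpath = './/comment()')
        dup.tap { |shadow| shadow.xpath(xpath).remove }.inner_html
      end
    end

    doc = Nokogiri.XML('<r><b>hello</b> <!-- foo --> world</r>')
    p doc.inner_html_reject #=> "<r><b>hello</b>  world</r>"
    p doc.inner_html        #=> "<r><b>hello</b> <!-- foo --> world</r>"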
Finally, note that if you don't need the markup and only want the text, the text itself never includes comments:
p doc.text
Phrogz