This is the soup from the WordPress post page:
content = soup.body.find('div', id=re.compile('post'))
title = content.h2.extract()
item['title'] = unicode(title.string)
item['content'] = u''.join(map(unicode, content.contents))
I want to omit the enclosing div tag when assigning item['content']. Is there a way to render all of a tag's child tags as Unicode? Something like:
item['content'] = content.contents.__unicode__()
which would give me a single unicode string instead of a list.
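For illustration, here is a minimal sketch of the behaviour being asked for. It assumes the newer bs4 package on Python 3 rather than the BeautifulSoup version used above, and the sample markup and post id are hypothetical. In bs4, `decode_contents()` renders a tag's children as one string without the enclosing tag; the older BeautifulSoup 3 API has a similar `renderContents()` method, which, as far as I know, returns an encoded byte string by default.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a WordPress post page.
html = """
<html><body>
<div id="post-1"><h2>Hello</h2><p>First <b>bold</b> paragraph.</p><p>Second.</p></div>
</body></html>
"""

# "html.parser" is the stdlib-backed parser, so no extra parser package is needed.
soup = BeautifulSoup(html, "html.parser")
content = soup.body.find("div", id=re.compile("post"))

# Remove the heading from the tree so it does not show up in the content.
title = content.h2.extract()

item = {}
item["title"] = str(title.string)
# Render only the children of the div, without the <div> wrapper itself.
item["content"] = content.decode_contents()
```

For this sample the result matches joining the string form of each child, just without building the list first.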
muhuk