Tags with : in the name in lxml

I am trying to use lxml.etree to parse a WordPress export document (this is XML, a bit like RSS). I'm only interested in published posts, so I use the following to select them:

for item in data.findall("item"):
    if item.find("wp:post_type").text != "post":
        continue
    if item.find("wp:status").text != "publish":
        continue
    write_post(item)

where data is the tag that contains the item tags. The item tags contain posts, pages, and drafts. My problem is that lxml cannot find tags with : in the name (e.g. wp:post_type). When I try item.find("wp:post_type"), I get this error:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "lxml.etree.pyx", line 1279, in lxml.etree._Element.find (src/lxml/lxml.etree.c:38124)
  File "/usr/lib64/python2.7/site-packages/lxml/_elementpath.py", line 210, in find
    it = iterfind(elem, path)
  File "/usr/lib64/python2.7/site-packages/lxml/_elementpath.py", line 200, in iterfind
    selector = _build_path_iterator(path)
  File "/usr/lib64/python2.7/site-packages/lxml/_elementpath.py", line 184, in _build_path_iterator
    selector.append(ops[token[0]](_next, token))
KeyError: ':'

So the failure is KeyError: ':'. How can I find tags with : in the name using lxml?

The : is the XML namespace separator. In lxml, replace the prefix with the namespace URL in curly braces, as in item.find("{http://example.org/}status").text.
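
As a rough sketch of that lookup applied to the loop from the question: the namespace URI below, the SAMPLE document, and the published_post_titles helper are all assumptions made for illustration. The real URI should be copied from the xmlns:wp attribute at the top of your own export file. The sketch shows two equivalent spellings, the {uri}tag form and a prefix map passed through the namespaces argument of find().

```python
try:
    from lxml import etree
except ImportError:
    # Fallback: the standard library's ElementTree accepts the same find() calls.
    import xml.etree.ElementTree as etree

# Assumed namespace URI; read the real one from xmlns:wp in your export file.
WP_NS = "http://wordpress.org/export/1.2/"

# Made-up miniature export, used only to exercise the lookup.
SAMPLE = """<rss xmlns:wp="http://wordpress.org/export/1.2/">
  <channel>
    <item><title>Hello</title><wp:post_type>post</wp:post_type><wp:status>publish</wp:status></item>
    <item><title>Draft</title><wp:post_type>post</wp:post_type><wp:status>draft</wp:status></item>
    <item><title>About</title><wp:post_type>page</wp:post_type><wp:status>publish</wp:status></item>
  </channel>
</rss>"""


def published_post_titles(xml_text):
    """Return the titles of items that are posts with status 'publish'."""
    root = etree.fromstring(xml_text)
    data = root.find("channel")   # plays the role of `data` in the question
    ns = {"wp": WP_NS}            # prefix -> namespace URI
    titles = []
    for item in data.findall("item"):
        # Spelling 1: namespace URI in curly braces before the tag name.
        if item.find("{%s}post_type" % WP_NS).text != "post":
            continue
        # Spelling 2: keep the wp: prefix and pass a prefix map.
        if item.find("wp:status", namespaces=ns).text != "publish":
            continue
        titles.append(item.find("title").text)
    return titles


print(published_post_titles(SAMPLE))
```

The prefix map tends to read better when many wp: tags are queried, since the URI is written only once.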
