Getting text values from XML in Python

Question

 from xml.dom.minidom import parseString
 dom = parseString(data)
 data = dom.getElementsByTagName('data')

The variable 'data' is returned as an element object, but I cannot find anything in the documentation about how to get the text value of an element.

For instance:

 <something><data>I WANT THIS</data></something>

Does anyone have any ideas?

python xml parsing

Mark sanborn Sep 16 '09 at 16:02

2 answers

The way to look at this is that "I WANT THIS" is actually another node: it is a text node that is a child of the "data" element.

 from xml.dom.minidom import parseString
 dom = parseString(data)
 nodes = dom.getElementsByTagName('data')

At this point, "nodes" is a NodeList, and in your example it contains one item, the "data" element. That "data" element in turn has only one child node, the text node "I WANT THIS".

So you can just do something like this:

 print(nodes[0].firstChild.nodeValue)
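
To make the node structure described above concrete, here is a minimal self-contained sketch that uses the XML from the question as a literal string. The names `element` and `text_node` are only illustrative.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

dom = parseString('<something><data>I WANT THIS</data></something>')
nodes = dom.getElementsByTagName('data')

element = nodes[0]              # the <data> element
text_node = element.firstChild  # its only child, a text node

print(len(nodes))                                # 1
print(element.nodeType == Node.ELEMENT_NODE)     # True
print(text_node.nodeType == Node.TEXT_NODE)      # True
print(text_node.nodeValue)                       # I WANT THIS
```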

Please note that if your input contains more than one "data" tag, you should iterate over "nodes" rather than index it directly.
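
A sketch of that iteration might look like the following. The input string with three "data" elements is made up for illustration, and the check for a missing child is an extra precaution that the answer does not mention.

```python
from xml.dom.minidom import parseString

# Hypothetical input containing several <data> elements
xml_text = ('<something>'
            '<data>first</data><data>second</data><data>third</data>'
            '</something>')
dom = parseString(xml_text)

values = []
for node in dom.getElementsByTagName('data'):
    child = node.firstChild
    # firstChild is None for an empty element such as <data/>
    if child is not None and child.nodeType == child.TEXT_NODE:
        values.append(child.nodeValue)

print(values)  # ['first', 'second', 'third']
```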

Brent nash Sep 16 '09 at 16:10

Andy · Accepted Answer · 2009-09-16T16:09:13+0000

This should do the trick:

 from xml.dom.minidom import parseString
 dom = parseString('<something><data>I WANT THIS</data></something>')
 data = dom.getElementsByTagName('data')[0].childNodes[0].data

i.e. you need to go one level deeper into the DOM structure to reach the text child of the node, and then access its value.
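
Indexing `childNodes[0]` assumes the element has exactly one text child. As a hedged sketch, a small helper (the name `get_text` is hypothetical, not part of minidom) can join all text children instead, which also copes with empty elements and with text that is split by a comment:

```python
from xml.dom.minidom import parseString

def get_text(element):
    """Join the data of all text and CDATA children of element."""
    parts = []
    for child in element.childNodes:
        if child.nodeType in (child.TEXT_NODE, child.CDATA_SECTION_NODE):
            parts.append(child.data)
    return ''.join(parts)

# Made-up input: the first <data> has its text split by a comment,
# the second <data> is empty and has no child nodes at all.
dom = parseString(
    '<something><data>I WANT <!-- note -->THIS</data><data/></something>'
)
first, second = dom.getElementsByTagName('data')

print(get_text(first))         # I WANT THIS
print(repr(get_text(second)))  # ''
```

With this input, `first.firstChild.nodeValue` alone would only return the text before the comment, and `second.childNodes[0]` would raise an IndexError.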
