Getting text values from XML in Python

    from xml.dom.minidom import parseString
    dom = parseString(data)
    data = dom.getElementsByTagName('data')

The variable 'data' is returned as an element object, but I cannot find anything in the documentation on how to get the text value of the element.

For instance:

 <something><data>I WANT THIS</data></something> 

Does anyone have any ideas?

2 answers

This should do the trick:

    from xml.dom.minidom import parseString
    dom = parseString('<something><data>I WANT THIS</data></something>')
    data = dom.getElementsByTagName('data')[0].childNodes[0].data

That is, you need to go one level deeper into the DOM structure: get the text node that is a child of the element, then read its value.
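Indexing childNodes[0] assumes the element has at least one child and that the first child is a text node. A minimal sketch of a more defensive variant follows; the helper name get_text and the extra empty <data/> element in the sample input are assumptions made for illustration, not part of the question.

```python
from xml.dom.minidom import parseString


def get_text(element):
    """Join the data of all direct text-node children of element."""
    # get_text is a hypothetical helper, not part of xml.dom.minidom.
    return "".join(
        child.data
        for child in element.childNodes
        if child.nodeType == child.TEXT_NODE
    )


# Sample input (assumed): one element with text, one empty element.
dom = parseString("<something><data>I WANT THIS</data><data/></something>")
first, second = dom.getElementsByTagName("data")

print(get_text(first))         # I WANT THIS
print(repr(get_text(second)))  # '' (no children, so nothing to join)
```

With an empty element, childNodes[0] would raise an IndexError, whereas this version returns an empty string.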


The way to look at this is that "I WANT THIS" is actually another node: it is a text node that is a child of the "data" element.

    from xml.dom.minidom import parseString
    dom = parseString(data)
    nodes = dom.getElementsByTagName('data')

At this point, "nodes" is a NodeList, and in your example it contains one item, which is the "data" element. That "data" element in turn has only one child node, which is the text node "I WANT THIS".

So you can just do something like this:

    print(nodes[0].firstChild.nodeValue)

Note that if your input contains more than one "data" tag, you should iterate over "nodes" rather than index it directly.
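For the case of several "data" tags, the iteration could look roughly like this. It is only a sketch: the sample input with two "data" elements and the names xml_text and values are assumptions for illustration.

```python
from xml.dom.minidom import parseString

# Sample input (assumed): two "data" elements instead of one.
xml_text = "<something><data>first</data><data>second</data></something>"
dom = parseString(xml_text)

values = []
for node in dom.getElementsByTagName("data"):
    # firstChild is None for an empty element, so skip those.
    if node.firstChild is not None:
        values.append(node.firstChild.nodeValue)

print(values)  # ['first', 'second']
```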
