How can I remove namespaces from the lxml tree?

This is a follow-up to "Removing child elements in XML using python ...".

Thanks @Tichodroma, I have this code:

If you can use lxml, try the following:

    import lxml.etree

    tree = lxml.etree.parse("leg.xml")
    for dog in tree.xpath("//Leg1:Dog", namespaces={"Leg1": "http://what.not"}):
        parent = dog.xpath("..")[0]
        parent.remove(dog)
        parent.text = None
    tree.write("leg.out.xml")

Now leg.out.xml looks like this:

    <?xml version="1.0"?>
    <Leg1:MOR xmlns:Leg1="http://what.not" oCount="7">
      <Leg1:Order>
        <Leg1:CTemp id="FO">
          <Leg1:Group bNum="001" cCount="4"/>
          <Leg1:Group bNum="002" cCount="4"/>
        </Leg1:CTemp>
        <Leg1:CTemp id="GO">
          <Leg1:Group bNum="001" cCount="4"/>
          <Leg1:Group bNum="002" cCount="4"/>
        </Leg1:CTemp>
      </Leg1:Order>
    </Leg1:MOR>

How do I change my code to remove the Leg1: namespace prefix from all element tag names?

2 answers

One possible way to remove a namespace prefix from each element is:

    from lxml import etree

    def strip_ns_prefix(tree):
        # iterate through element nodes only (skips comments, text nodes, etc.)
        for element in tree.xpath('descendant-or-self::*'):
            # if the element has a prefix...
            if element.prefix:
                # ...replace the element name with its local name
                element.tag = etree.QName(element).localname
        return tree

Another version that checks for a namespace in the XPath expression instead of using an if statement:

    from lxml import etree

    def strip_ns_prefix(tree):
        # XPath query selecting all element nodes that are in a namespace
        query = "descendant-or-self::*[namespace-uri()!='']"
        # for each element returned by the query above...
        for element in tree.xpath(query):
            # ...replace the element name with its local name
            element.tag = etree.QName(element).localname
        return tree
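
To try this end to end, here is a minimal sketch that runs the XPath-based version against a shortened copy of the XML from the question. One thing to be aware of: renaming the tags does not by itself remove the `xmlns:Leg1` declaration from the root element, so the sketch also calls `etree.cleanup_namespaces()`, which drops namespace declarations that nothing uses any more. The inline `XML` string and the `result` variable are only for illustration; in the question the input would come from `leg.xml`.

```python
from lxml import etree

# Shortened copy of the document from the question (illustrative input).
XML = b"""<?xml version="1.0"?>
<Leg1:MOR xmlns:Leg1="http://what.not" oCount="7">
  <Leg1:Order>
    <Leg1:CTemp id="FO">
      <Leg1:Group bNum="001" cCount="4"/>
    </Leg1:CTemp>
  </Leg1:Order>
</Leg1:MOR>"""


def strip_ns_prefix(tree):
    # Select only element nodes that belong to some namespace.
    for element in tree.xpath("descendant-or-self::*[namespace-uri()!='']"):
        # Replace the qualified tag with its local name.
        element.tag = etree.QName(element).localname
    return tree


root = etree.fromstring(XML)
strip_ns_prefix(root)
# Remove xmlns declarations that no element or attribute refers to any more.
etree.cleanup_namespaces(root)

result = etree.tostring(root, pretty_print=True).decode()
print(result)
```

Attributes are left untouched here. If the document had namespaced attributes, their declarations would still be in use and would stay in the output.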

The following function strips namespaces from the element tags of an lxml tree, modifying the tree in place:

    def strip_ns(tree):
        for node in tree.iter():
            try:
                has_namespace = node.tag.startswith('{')
            except AttributeError:
                continue  # node.tag is not a string (node is a comment or similar)
            if has_namespace:
                node.tag = node.tag.split('}', 1)[1]
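
A quick way to check what this does is to run it on a small document that mixes a default namespace, a prefixed element and a comment. The sketch below does that; the `example.invalid` URIs and the element names are made up for illustration. Because the check looks at the `{uri}` part of the tag rather than at a prefix, it should also handle elements in a default (unprefixed) namespace, and the comment is skipped because its `tag` is not a string.

```python
from lxml import etree


def strip_ns(tree):
    for node in tree.iter():
        try:
            has_namespace = node.tag.startswith("{")
        except AttributeError:
            continue  # node.tag is not a string (comment or similar)
        if has_namespace:
            node.tag = node.tag.split("}", 1)[1]


# Hypothetical input: default namespace, one prefixed child, one comment.
doc = etree.fromstring(
    b'<root xmlns="http://example.invalid/default"'
    b' xmlns:a="http://example.invalid/a">'
    b"<!-- a comment --><a:child/><plain/></root>"
)

strip_ns(doc)  # modifies the tree in place and returns None

# Collect the tags of real elements only (comments have a non-string tag).
tags = [el.tag for el in doc.iter() if isinstance(el.tag, str)]
print(tags)
```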