Using lxml with fileinput in Python
Given a simple XML file:
<?xml version="1.0" encoding="UTF-8" ?>
<root>
<child>abc</child>
</root>
I wanted to parse it from a file, and this works well:
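import lxml.etree

# Binary mode lets the parser handle the encoding declared in the XML itself.
with open('tst.xml', 'rb') as test_xml:
    for _, element in lxml.etree.iterparse(test_xml, tag='child'):
        print(element.text)  # prints abc as expected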
However, when I modified the script so that it could parse the XML from either a file or stdin, it failed:
import fileinput
import lxml.etree

fi = fileinput.input('tst.xml')
for _, element in lxml.etree.iterparse(fi, tag='child'):
    print(element.text)
# File "iterparse.pxi", line 371, in lxml.etree.iterparse.__init__ (src/lxml/lxml.etree.c:97283)
# File "apihelpers.pxi", line 1411, in lxml.etree._encodeFilename (src/lxml/lxml.etree.c:22515)
# TypeError: Argument must be string or unicode.
I'm not sure what I'm doing wrong. Is a FileInput object not a file-like object in Python?
Without digging too deep, the exception seems to occur because the FileInput class does not provide a read method, so lxml apparently treats the object as a filename rather than as a file. To achieve my goal, I eventually wrote my own wrapper:
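import sys

class FileInput(object):
    """Context manager yielding the named file, or stdin for no name or "-"."""
    def __init__(self, filename=None, *args, **kwargs):
        self.file = open(filename, *args, **kwargs) if filename and filename != "-" else sys.stdin
    def __enter__(self):
        return self.file
    def __exit__(self, type, value, traceback):
        # Leave stdin open; only close files this wrapper opened itself.
        if self.file is not sys.stdin:
            self.file.close()
    def __getattr__(self, name):
        # Delegate everything else (read, readline, ...) to the underlying file.
        return getattr(self.file, name)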
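The missing read method mentioned above can be checked with the standard library alone. This is only a quick sketch: has_read is a hypothetical helper name, and io.BytesIO stands in for an ordinary file-like object.

```python
import fileinput
import io


def has_read(obj):
    """Return True if obj exposes a callable read attribute."""
    return callable(getattr(obj, "read", None))


# fileinput.FileInput offers readline() and iteration, but no read().
file_input_has_read = has_read(fileinput.FileInput)

# An in-memory binary stream, used here for comparison.
bytes_io_has_read = has_read(io.BytesIO)

print(file_input_has_read)
print(bytes_io_has_read)
```

If the first check is false and the second is true, that would be consistent with the traceback going through _encodeFilename: the object is not recognised as a readable file.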
I will wait for a better answer though.
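For what it's worth, the same idea can probably be written more compactly with contextlib. The sketch below is an assumption-laden variant, not a tested replacement: open_or_stdin is a made-up name, the default mode "rb" is my choice, and sys.stdin.buffer is used when it exists (Python 3) so that the parser receives bytes.

```python
import contextlib
import sys


@contextlib.contextmanager
def open_or_stdin(filename=None, mode="rb"):
    """Yield an open file for filename, or standard input for None or "-".

    open_or_stdin is a hypothetical helper, not part of lxml or fileinput.
    """
    if filename and filename != "-":
        # A real file: the with block closes it when the caller is done.
        with open(filename, mode) as handle:
            yield handle
    else:
        # Python 3 exposes the byte stream as sys.stdin.buffer;
        # fall back to sys.stdin itself where that attribute is missing.
        yield getattr(sys.stdin, "buffer", sys.stdin)
```

It would then be used like the first snippet, with "with open_or_stdin(name) as source:" followed by the same lxml.etree.iterparse(source, tag='child') loop. Unlike the class above, stdin is never closed here simply because nothing in the stdin branch closes it.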