Why should I give `savetxt` a file opened in binary rather than text mode?

I was bitten by the following numpy behavior:

    In [234]: savetxt(open('/tmp/a.dat', 'wt'), array([1, 2, 3]))
    ---------------------------------------------------------------------------
    TypeError                                 Traceback (most recent call last)
    <ipython-input-234-2adef92da877> in <module>()
    ----> 1 savetxt(open('/tmp/a.dat', 'wt'), array([1, 2, 3]))

    /local/gerrit/python3.2/lib/python3.2/site-packages/numpy/lib/npyio.py in savetxt(fname, X, fmt, delimiter, newline)
       1007         else:
       1008             for row in X:
    -> 1009                 fh.write(asbytes(format % tuple(row) + newline))
       1010         finally:
       1011             if own_fh:

    TypeError: must be str, not bytes

    In [235]: savetxt(open('/tmp/a.dat', 'wb'), array([1, 2, 3])) # success

I find this strange. I am trying to save my array to a text file, so why should I open the file in binary mode?

+4
2 answers

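Because the data that `savetxt` writes is bytes (i.e. binary) data: as the traceback shows, it converts each formatted line with `asbytes` before calling `fh.write`.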

What comes out is still a text file. Don't worry. :-) A "text" file is defined by the fact that it contains only human-readable text, not by the mode in which you open it. The mode only affects how the file object handles the data you write.

Text mode means that the file object expects Unicode strings (`str`) and encodes them to bytes for you. Binary mode means that it expects `bytes` and writes them without any encoding step.
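To make the difference concrete, here is a minimal sketch using only the standard library. The file name and the sample line are placeholders chosen for illustration; they are not taken from numpy.

```python
import os
import tempfile

# One line of output in the style savetxt produces (illustrative value).
line = "1.000000000000000000e+00\n"
text_mode_error = None

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "a.dat")

    # Text mode: write() takes str and encodes it to bytes for you.
    with open(path, "wt", encoding="ascii") as fh:
        fh.write(line)
        try:
            # Handing bytes to a text-mode file raises TypeError,
            # which is what the question ran into.
            fh.write(line.encode("ascii"))
        except TypeError as exc:
            text_mode_error = str(exc)

    # Binary mode: write() takes bytes and stores them unchanged.
    with open(path, "wb") as fh:
        fh.write(line.encode("ascii"))

    # Either way, the file on disk holds plain, readable text.
    with open(path, "rb") as fh:
        raw = fh.read()

print(text_mode_error)
print(raw)
```

In both cases the result on disk is ordinary text; the mode only decides who does the encoding.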

+4

Most likely because the numpy maintainers have not updated this function for full compatibility with Python 3. The name "savetxt" certainly implies that a text-mode file should be sufficient, and there is nothing stopping them from calling `fh.write((format % tuple(row) + newline).encode())`.

There is nothing wrong with using binary mode, except that in some cases it is surprising, as you discovered. I consider this an API design mistake, if nothing else.
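As a rough illustration of a less surprising design, a writer could check what kind of file object it was handed and encode only when needed. This is a hypothetical sketch, not numpy's implementation: `write_rows` and its parameters are names invented for the example.

```python
import io


def write_rows(fh, rows, fmt="%.18e", newline="\n"):
    """Hypothetical helper: write one formatted value per line.

    Accepts a file object opened in either text or binary mode.
    """
    for row in rows:
        text = fmt % row + newline
        if isinstance(fh, io.TextIOBase):
            # Text-mode file: pass str and let the file object encode it.
            fh.write(text)
        else:
            # Binary-mode file: encode explicitly before writing.
            fh.write(text.encode("ascii"))


# In-memory stand-ins for files opened with 'wt' and 'wb'.
text_buf = io.StringIO()
bytes_buf = io.BytesIO()
write_rows(text_buf, [1, 2, 3])
write_rows(bytes_buf, [1, 2, 3])
```

Both buffers end up with the same content, so the caller no longer has to know which mode the function expects.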

+1
