Because you called .readline() first, the codecs.open() file object has already filled an internal line buffer; the subsequent call to .readlines() returns only the lines that are in that buffer.
If you call .readlines() again, the rest of the lines are returned:
    >>> f = codecs.open(filename, 'r3', encoding='utf-8')
    >>> line = f.readline()
    >>> len(f.readlines())
    7
    >>> len(f.readlines())
    71
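The workaround is to not mix .readline() and .readlines(); read everything with .readlines() and take the first line off the resulting list:

    f = codecs.open(filename, 'r3', encoding='utf-8')
    data_f = f.readlines()                 # read every line in one call
    names_f = data_f.pop(0).split(' ')     # first line holds the space-separated names

This behavior is really a bug; the Python developers are aware of it, see issue 8260.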
Another option is to use io.open() instead of codecs.open(); the io library is what Python 3 uses to implement the built-in open() function, and it is much more robust and versatile than the codecs module.
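As a minimal sketch of the io.open() route, the snippet below mixes .readline() and .readlines() on a file opened with io.open(). The function name read_header_and_rows is hypothetical, and the space-separated header line is an assumption carried over from the workaround above.

```python
import io


def read_header_and_rows(path, encoding="utf-8"):
    """Return (header fields, remaining lines) for a text file.

    Hypothetical helper: assumes the first line is a space-separated header.
    """
    with io.open(path, "r", encoding=encoding) as f:
        # Read the header with readline() ...
        header = f.readline().rstrip("\r\n").split(" ")
        # ... then readlines() returns every remaining line in a single call.
        rows = f.readlines()
    return header, rows
```

With io.open(), a single .readlines() call after .readline() should return all remaining lines, so there is no need to call it a second time.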
Martijn Pieters