read_fwf in pandas does not honor the comment character if colspecs does not include the first column

When reading fixed-width files with the read_fwf function in pandas (0.18.1) on Python (3.4.3), you can specify a comment character using the comment argument. I expected all lines starting with the comment character to be ignored. However, if none of the entries in colspecs covers the first column of the file, the comment character has no effect.

    import io, sys
    import pandas as pd

    sys.version
    # '3.4.3 (v3.4.3:9b73f1c3e601, Feb 24 2015, 22:43:06) [MSC v.1600 32 bit (Intel)]'
    pd.__version__
    # '0.18.1'

    # Two input files, first line is comment, second line is data.
    # Second file has a column (with the letter A)
    # that I don't want at start of data.
    string = "#\n1K\n"
    off_string = "#\nA1K\n"

    # When using skiprows to skip commented row, both work.
    pd.read_fwf(io.StringIO(string), colspecs = [(0,1), (1,2)], skiprows = 1, header = None)
    #    0  1
    # 0  1  K
    pd.read_fwf(io.StringIO(off_string), colspecs = [(1,2), (2,3)], skiprows = 1, header = None)
    #    0  1
    # 0  1  K

    # If a comment character is specified, it only works when the colspecs
    # includes the column with the comment character.
    pd.read_fwf(io.StringIO(string), colspecs = [(0,1), (1,2)], comment = '#', header = None)
    #    0  1
    # 0  1  K
    pd.read_fwf(io.StringIO(off_string), colspecs = [(1,2), (2,3)], comment = '#', header = None)
    #      0    1
    # 0  NaN  NaN
    # 1  1.0    K

Is there any documentation specifically referencing this? A simple workaround is to include the first column and then delete it after, but I wanted to check if this was a mistake or a misunderstanding of the expected behavior.
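
For reference, the workaround of including the first column and dropping it afterwards could look roughly like this. It is only a sketch: `read_fwf_drop_first` is a hypothetical helper name (not part of pandas), and it assumes the comment character always sits in column 0 and that none of the wanted column specs already covers column 0.

```python
import io

import pandas as pd


def read_fwf_drop_first(buf, colspecs, comment="#"):
    """Hypothetical helper: also read column 0 so the parser can see the
    comment character, then drop that helper column again."""
    # Prepend a spec for the first character of every line.
    widened = [(0, 1)] + list(colspecs)
    df = pd.read_fwf(buf, colspecs=widened, comment=comment, header=None)
    # Remove the helper column and renumber the remaining columns from 0.
    df = df.drop(columns=0)
    df.columns = range(df.shape[1])
    return df


off_string = "#\nA1K\n"
df = read_fwf_drop_first(io.StringIO(off_string), [(1, 2), (2, 3)])
print(df)
```

The helper column costs one extra character per line, so the overhead is small, but the approach breaks down if comment markers can appear anywhere other than column 0.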

1 answer

I think this is a bug. The documentation says that if the comment character is found at the beginning of a line, the whole line is ignored. The problem is that the columns are sliced out in FixedWidthReader.__next__ before the line is checked for comments (in PythonParser or CParserWrapper), so when colspecs leaves out the first column, the comment character has already been cut away by the time the check runs. The relevant code is in io/parsers.py.
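
Since the slicing happens before the comment check, one way to sidestep the issue is to remove comment lines before read_fwf ever sees them. The following is a minimal sketch under the assumption that a comment line is any line whose first character is the comment character; `strip_comment_lines` is a hypothetical helper, not a pandas function.

```python
import io

import pandas as pd


def strip_comment_lines(text, comment="#"):
    """Hypothetical helper: return text without the lines that start
    with the comment character."""
    kept = [
        line
        for line in text.splitlines(keepends=True)
        if not line.startswith(comment)
    ]
    return "".join(kept)


off_string = "#\nA1K\n"
cleaned = strip_comment_lines(off_string)

# No comment argument is needed any more; colspecs can skip column 0 freely.
filtered_df = pd.read_fwf(io.StringIO(cleaned), colspecs=[(1, 2), (2, 3)], header=None)
print(filtered_df)
```

This reads the whole input into memory as a string, which is fine for small files; for large files the same filter could be applied line by line while writing to a temporary buffer.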
