I am following the examples from a Python book on data analysis, in particular the 2012 election database from chapter 9. The data is in a large comma-separated CSV file, but each line of the file ends with an extra trailing delimiter, which seems to confuse pandas.read_csv.
It treats the extra delimiter as if there were an extra column, so each row has one more column than there are headers. pandas.read_csv then uses the first column as row labels. The overall effect is that the columns and headers no longer line up: the first column becomes the row labels, the second column gets the first header, and so on.
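To illustrate, here is a minimal sketch that reproduces the misalignment. The data is a made-up three-column miniature, not the real file from the book, and the column names are only placeholders.

```python
import io

import pandas as pd

# Hypothetical miniature of the file: three headers, but every data row
# ends with an extra trailing comma (so each row has four fields).
raw = (
    "cand_nm,contbr_nm,amount\n"
    "Obama,Smith,100,\n"
    "Romney,Jones,250,\n"
)

shifted = pd.read_csv(io.StringIO(raw))

# The first field of each row ends up as the row label, and the
# remaining fields slide one header to the left.
print(shifted)
```

With this input, the candidate names become the index, the contributor names land under cand_nm, and the amount column holds only missing values.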
This is pretty annoying. Any idea how to tell pandas.read_csv to do the right thing? I could not find an option for it.
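For reference, the pandas documentation describes passing index_col=False for files with a delimiter at the end of each line. A minimal sketch on made-up data (again placeholder column names, not the real file), which may or may not cover every case in the actual data set:

```python
import io

import pandas as pd

# Hypothetical miniature of the file, with a trailing comma on each row.
raw = (
    "cand_nm,contbr_nm,amount\n"
    "Obama,Smith,100,\n"
    "Romney,Jones,250,\n"
)

# index_col=False tells pandas not to use the first column as the index,
# so the headers line up with the data and the empty trailing field is dropped.
fixed = pd.read_csv(io.StringIO(raw), index_col=False)

print(fixed)
```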
Great book, BTW.