I have a CSV file that I read into a DataFrame using pandas. I want to set my own column headers instead of using the first line of the file as the default. (I also skip some lines.) How can I achieve this?
I tried the following, but this did not work as expected:
header_row=['col1','col2','col3','col4', 'col1', 'col2']
This gives the following error:
  File "third_party/py/pandas/io/parsers.py", line 187, in read_csv
  File "third_party/py/pandas/io/parsers.py", line 160, in _read
  File "third_party/py/pandas/io/parsers.py", line 628, in get_chunk
  File "third_party/py/pandas/core/frame.py", line 302, in __init__
  File "third_party/py/pandas/core/frame.py", line 388, in _init_dict
  File "third_party/py/pandas/core/internals.py", line 1008, in form_blocks
  File "third_party/py/pandas/core/internals.py", line 1036, in _simple_blockify
  File "third_party/py/pandas/core/internals.py", line 1068, in _stack_dict
IndexError: index out of bounds
Then I tried setting the columns through
df.columns = header_row
But this error appeared, probably due to duplicate column values.
  File "engines.pyx", line 101, in pandas._engines.DictIndexEngine.get_loc (third_party/py/pandas/src/engines.c:2498)
  File "engines.pyx", line 107, in pandas._engines.DictIndexEngine.get_loc (third_party/py/pandas/src/engines.c:2447)
Exception: ('Index values are not unique', 'occurred at index entity')
I am using pandas version 0.7.3. The documentation describes the parameter as follows:
names: array-like List of column names
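To make the goal concrete, here is a minimal sketch of the result I am after. It assumes a recent pandas release rather than 0.7.3, so it may not behave the same way on my version. The inline CSV text and the `csv_text` name are made up for illustration; only `header_row` comes from my actual code.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for my real CSV file.
csv_text = "a,b,c,d,e,f\n1,2,3,4,5,6\n7,8,9,10,11,12\n"

# The headers I want, including the duplicate names.
header_row = ['col1', 'col2', 'col3', 'col4', 'col1', 'col2']

# header=None stops pandas from treating any line as the header;
# skiprows=1 drops the file's original first line.
df = pd.read_csv(io.StringIO(csv_text), header=None, skiprows=1)

# Assign the labels after reading (assumption: a recent pandas,
# where duplicate column labels are accepted on assignment).
df.columns = header_row

print(df)
```

The labels are assigned after reading instead of being passed through `names`, because I am not sure how `names` treats duplicate entries across versions.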
I am sure I am missing something simple here. Thanks for any help here.