Pythonic way to read CSV with row and column headers

Question

Suppose there is a CSV table with row and column headings, for example:

    , "Car", "Bike", "Boat", "Plane", "Shuttle"
    "Red", 1, 7, 3, 0, 0
    "Green", 5, 0, 0, 0, 0
    "Blue", 1, 1, 4, 0, 1

I want to get row and column headers, i.e.:

    col_headers = ["Car", "Bike", "Boat", "Plane", "Shuttle"]
    row_headers = ["Red", "Green", "Blue"]
    data = [[1, 7, 3, 0, 0],
            [5, 0, 0, 0, 0],
            [1, 1, 4, 0, 1]]

Of course I can do something like

    import csv

    with open("path/to/file.csv", "r") as f:
        csvraw = list(csv.reader(f))

    col_headers = csvraw[0][1:]  # first row, minus the empty corner cell
    row_headers = [row[0] for row in csvraw[1:]]
    data = [row[1:] for row in csvraw[1:]]

... but it doesn't look Pythonic enough.

Is there a cleaner way for this natural operation?

python csv

Piotr Migdal Nov 10 '12 at 18:36

5 answers

Gareth Latty · Answer 1 · 2012-11-10T18:38:53+0000

Take a look at csv.DictReader.

If fieldnames is omitted, the values in the first line of csvfile will be used as field names.

Then you can just use reader.fieldnames. This, of course, only gives you the column headers. You still have to parse the row headers manually.
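As a rough sketch of what that could look like: the inline sample below stands in for the file from the question, and the names SAMPLE, label_key, row_headers and data are illustrative assumptions, not part of the csv module. With skipinitialspace=True the empty corner cell becomes the field name '', and each row's label is stored under that key.

```python
import csv
import io

# Inline copy of the table from the question (assumption: stands in for a real file).
SAMPLE = ''', "Car", "Bike", "Boat", "Plane", "Shuttle"
"Red", 1, 7, 3, 0, 0
"Green", 5, 0, 0, 0, 0
"Blue", 1, 1, 4, 0, 1
'''

reader = csv.DictReader(io.StringIO(SAMPLE), skipinitialspace=True)

# The first field name is the empty corner cell; the rest are the column headers.
label_key, *col_headers = reader.fieldnames

row_headers = []
data = []
for row in reader:
    row_headers.append(row[label_key])  # row label, found manually
    data.append([int(row[name]) for name in col_headers])  # cells as integers

print(col_headers)
print(row_headers)
print(data)
```

The int() conversion is only there because the sample holds whole numbers; drop it or swap in float() for other data.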

I think your original solution is pretty good.

Piotr Migdal · Answer 2 · 2013-06-15T12:33:07+0000

Now I see that what I want is easiest (and most robust) to do with Pandas.

    import pandas as pd

    df = pd.read_csv('foo.csv', index_col=0)

And if I want, it is easy to extract the headers:

    col_headers = list(df.columns)
    row_headers = list(df.index)

Otherwise, in raw Python, it seems that the method I wrote in the question is "good enough."

Davoud Taghawi-Nejad · Answer 3 · 2012-11-10T20:17:41+0000

I know that this solution gives you a different output format than requested, but it is very convenient. It reads each CSV line into a dictionary:

    import csv

    # parameters_file and dialect are assumed to be defined elsewhere
    with open(parameters_file, newline="") as f:
        reader = csv.reader(f, dialect)
        keys = [key.lower() for key in next(reader)]  # header row, lower-cased
        for line in reader:
            parameter = dict(zip(keys, line))  # one dict per data row

Jon Clements · Answer 4 · 2013-06-15T13:58:42+0000

Without third-party libraries (and if you can live with results that are tuples from the zip operation):

    import csv

    with open('your_csv_file') as fin:
        csvin = csv.reader(fin, skipinitialspace=True)
        col_header = next(csvin, [])[1:]
        row_header, data = zip(*((row[0], row[1:]) for row in csvin))

With the table from the question, this gives you for col_header, row_header and data:

    ['Car', 'Bike', 'Boat', 'Plane', 'Shuttle']
    ('Red', 'Green', 'Blue')
    (['1', '7', '3', '0', '0'], ['5', '0', '0', '0', '0'], ['1', '1', '4', '0', '1'])
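If the tuples and string cells are a problem, a possible variation (a sketch only, using an inline sample in place of a real file, and assuming every data cell is a whole number) is to collect the rows first and build plain lists with the cells converted to int:

```python
import csv
import io

# Inline copy of the table from the question (assumption: stands in for a real file).
SAMPLE = ''', "Car", "Bike", "Boat", "Plane", "Shuttle"
"Red", 1, 7, 3, 0, 0
"Green", 5, 0, 0, 0, 0
"Blue", 1, 1, 4, 0, 1
'''

with io.StringIO(SAMPLE) as fin:
    csvin = csv.reader(fin, skipinitialspace=True)
    col_headers = next(csvin, [])[1:]      # header row without the corner cell
    rows = [row for row in csvin if row]   # skip blank lines, if any

row_headers = [row[0] for row in rows]
data = [[int(cell) for cell in row[1:]] for row in rows]

print(col_headers)
print(row_headers)
print(data)
```

Unlike the zip(*...) version, this also copes with a file that has a header row but no data rows, at the cost of two extra comprehensions.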

i love mistaking · Answer 5 · 2017-07-05T12:07:10+0000

Agree, pandas is the best I have found. I am interested in reading certain values of my frame. Here is what I did:

    import pandas as pd

    d = pd.read_csv(pathToFile + "easyEx.csv")
    print(d)
    print(d.index.values)
    print(d.index.values[2])
    print(d.columns.values)
    print(d.columns.values[2])
    print(pd.DataFrame(d, index=['Blue'], columns=['Boat']) + 0.333)

And this is what it returns:

           Car  Bike  Boat  Plane  Shuttle
    Red      1     7     3      0        0
    Green    5     0     0      0        0
    Blue     1     1     4      0        1
    ['Red' 'Green' 'Blue']
    Blue
    ['Car' 'Bike' 'Boat' 'Plane' 'Shuttle']
    Boat
           Boat
    Blue  4.333

Note that I can inspect the row and column names through the "index" and "columns" attributes. Also note that I can read a specific element of the DataFrame by its row and column names, and that the values are still numeric, which is why I added "+0.333" in the last print.

I edited the data file: I removed the quote characters (") and the spaces after the commas in the first line. Here is the easyEx.csv file:

    Car,Bike,Boat,Plane,Shuttle
    Red, 1, 7, 3, 0, 0
    Green, 5, 0, 0, 0, 0
    Blue, 1, 1, 4, 0, 1

Hope this helps =)
