How to read weird csv files in Pandas?

I would like to read the example CSV file shown below:

    --------------
    |A|B|C|
    --------------
    |1|2|3|
    --------------
    |4|5|6|
    --------------
    |7|8|9|
    --------------

I tried

    pd.read_csv("sample.csv", sep="|")

But that did not work.

How can I read this csv?

3 answers

You can pass the comment parameter to read_csv so that the dashed lines are skipped, then drop the all-NaN columns with dropna:

    import pandas as pd
    import io

    temp=u"""--------------
    |A|B|C|
    --------------
    |1|2|3|
    --------------
    |4|5|6|
    --------------
    |7|8|9|
    --------------"""
    #after testing replace io.StringIO(temp) to filename
    df = pd.read_csv(io.StringIO(temp), sep="|", comment='-').dropna(axis=1, how='all')
    print (df)

       A  B  C
    0  1  2  3
    1  4  5  6
    2  7  8  9
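To see why dropna is needed, it can help to look at the frame before the empty columns are removed. This is a minimal sketch, assuming the same sample text as in the question (the names sample and raw are only for illustration). The leading and trailing | on each line produce an empty field on either side, which pandas reads as two unnamed, all-NaN columns.

```python
import io

import pandas as pd

# Same layout as the file in the question (illustrative copy).
sample = """--------------
|A|B|C|
--------------
|1|2|3|
--------------
|4|5|6|
--------------
|7|8|9|
--------------"""

# comment='-' makes pandas skip the dashed separator lines.
raw = pd.read_csv(io.StringIO(sample), sep="|", comment="-")

# The outer pipes create one empty column on each side of A, B, C.
print(list(raw.columns))

# Dropping the columns that are entirely NaN leaves only the real data.
df = raw.dropna(axis=1, how="all")
print(df)
```

Keep in mind that comment='-' cuts a line at the first '-', so this approach assumes the data itself contains no '-' (negative numbers, for example).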

More general solution:

    import pandas as pd
    import io

    temp=u"""--------------
    |A|B|C|
    --------------
    |1|2|3|
    --------------
    |4|5|6|
    --------------
    |7|8|9|
    --------------"""
    #after testing replace io.StringIO(temp) to filename
    #separator is char which is NOT in csv
    df = pd.read_csv(io.StringIO(temp), sep="^", comment='-')
    #remove first and last | in data and in column names
    df.iloc[:,0] = df.iloc[:,0].str.strip('|')
    df.columns = df.columns.str.strip('|')
    #split column names
    cols = df.columns.str.split('|')[0]
    #split data
    df = df.iloc[:,0].str.split('|', expand=True)
    df.columns = cols
    print (df)

       A  B  C
    0  1  2  3
    1  4  5  6
    2  7  8  9
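One thing to be aware of with the split-based approach: str.split returns strings, so the resulting columns hold text rather than numbers even though the printed frame looks the same. A minimal sketch of converting them afterwards, assuming every column is meant to be numeric (the names sample, raw, parts and numeric are only for illustration):

```python
import io

import pandas as pd

# Same layout as the file in the question (illustrative copy).
sample = """--------------
|A|B|C|
--------------
|1|2|3|
--------------
|4|5|6|
--------------
|7|8|9|
--------------"""

# '^' does not occur in the data, so each line is read as one single field.
raw = pd.read_csv(io.StringIO(sample), sep="^", comment="-")

# The only column name is the header line itself, e.g. '|A|B|C|'.
cols = raw.columns[0].strip("|").split("|")

# Strip the outer pipes and split the remaining text into separate columns.
parts = raw.iloc[:, 0].str.strip("|").str.split("|", expand=True)
parts.columns = cols

# The pieces are still strings; convert each column to a numeric dtype.
numeric = parts.apply(pd.to_numeric)
print(numeric.dtypes)
```

If only some columns are numeric, apply pd.to_numeric to those columns alone instead of the whole frame.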

Try the csv module instead of using pandas directly.

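    import csv

    easy_csv = []
    # newline='' is what the csv module expects for files in Python 3
    with open('sample.csv', newline='') as csvfile:
        test = csv.reader(csvfile, delimiter='|')
        for row in test:
            # ignore blank lines and the rows that contain only ----
            if not row or row[0].startswith('-'):
                continue
            # remove the empty fields left by the first and last |
            row_preprocessed = row[1:-1]
            easy_csv.append(row_preprocessed)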

After this preprocessing, you can save the rows to a comma-separated CSV file, which pandas can then read easily.
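If the intermediate file is not needed, the cleaned rows can also be passed to pandas directly. This is a minimal sketch of the whole route, assuming the same layout as the file in the question. It reads from an in-memory string so that it runs as-is, and the names sample and rows are only for illustration.

```python
import csv
import io

import pandas as pd

# Same layout as the file in the question (illustrative copy).
sample = """--------------
|A|B|C|
--------------
|1|2|3|
--------------
|4|5|6|
--------------
|7|8|9|
--------------"""

rows = []
# For a real file, replace io.StringIO(sample) with open('sample.csv', newline='').
for row in csv.reader(io.StringIO(sample), delimiter="|"):
    # Skip blank lines and the dashed separator lines.
    if not row or row[0].startswith("-"):
        continue
    # Drop the empty fields produced by the leading and trailing pipe.
    rows.append(row[1:-1])

# The first cleaned row is the header; the rest are data (still strings).
df = pd.DataFrame(rows[1:], columns=rows[0]).astype(int)
print(df)
```

The astype(int) call assumes all values are integers; leave it out or adjust it for mixed data.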


I tried this code and it works:

    import pandas as pd
    import numpy as np

    a = pd.read_csv("a.csv",sep="|")
    print(a)
    for i in a:
        print(i)
