I'm afraid you can't identify the delimiter without knowing what it is. The problem with CSV is that, quoting ESR:
"The Microsoft version of CSV is a textbook example of how not to design a textual file format."
The delimiter has to be escaped somehow if it can appear inside fields. Without knowing how the escaping is done, identifying the delimiter automatically is difficult. Escaping could be done the UNIX way, with a backslash '\', or the Microsoft way, by wrapping the field in quotation marks, which then have to be escaped themselves when they occur in the data. This is not a trivial task.
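To make the difference concrete, here is a minimal sketch that parses the same logical row written in both conventions. The sample strings and the `parse` helper are made up for illustration; only the `csv` module calls are real. The last line shows what happens when the wrong convention is assumed.

```python
import csv
import io

# The same logical row, ['a|b', 'say "hi"', 'c'], serialized two ways
# (both strings are invented sample data).
backslash_escaped = 'a\\|b|say "hi"|c\n'        # UNIX style: \ before the delimiter
quote_escaped = '"a|b"|"say ""hi"""|c\n'        # Microsoft style: quoted field, "" for a quote


def parse(text, **fmt):
    """Hypothetical helper: parse the first row of pipe-delimited text."""
    return next(csv.reader(io.StringIO(text), delimiter="|", **fmt))


unix_row = parse(backslash_escaped, quoting=csv.QUOTE_NONE, escapechar="\\")
ms_row = parse(quote_escaped, quotechar='"', doublequote=True)

# Reading the quoted file with the backslash rules splits inside the quotes.
wrong_row = parse(quote_escaped, quoting=csv.QUOTE_NONE, escapechar="\\")

print(unix_row)
print(ms_row)
print(wrong_row)
```

Both correct parses should give the same three fields, while the mismatched one yields four, which is why guessing the delimiter alone is not enough.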
So my suggestion is to get full documentation from whoever generates the file you want to convert. Then you can use one of the approaches suggested in the other answers, or some variant of them.
Edit:
Python provides csv.Sniffer to help you determine the format of your DSV. If your input looks like this (note the quoted separator in the first field of the second line):
    a|b|c
    "a|b"|c|d
    foo|"bar|baz"|qux
You can do this:
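    import csv

    # newline="" lets the csv module handle line endings itself
    with open("csvfile.csv", newline="") as csvfile:
        # Guess the dialect from the first 1024 characters, then rewind
        dialect = csv.Sniffer().sniff(csvfile.read(1024))
        csvfile.seek(0)
        reader = csv.DictReader(csvfile, dialect=dialect)
        for row in reader:
            print(row)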
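Since the Sniffer only guesses, it may be worth checking what it found before trusting it. The following is a rough sketch, not a definitive recipe: it runs on the sample input above held in memory, restricts the guess to a few candidate delimiters (the candidate list and the `sniff_or_none` helper are my own assumptions), and handles the `csv.Error` that `sniff` raises when it cannot decide.

```python
import csv
import io

# The sample input from above, kept in memory so the sketch is self-contained.
sample = 'a|b|c\n"a|b"|c|d\nfoo|"bar|baz"|qux\n'


def sniff_or_none(text, candidates="|,;\t"):
    """Hypothetical helper: return the sniffed dialect, or None if sniffing fails."""
    try:
        # Only delimiters listed in `candidates` are considered.
        return csv.Sniffer().sniff(text, delimiters=candidates)
    except csv.Error:
        return None


dialect = sniff_or_none(sample)
rows = []
if dialect is not None:
    # Inspect the guess before relying on it.
    print("delimiter:", repr(dialect.delimiter))
    print("quotechar:", repr(dialect.quotechar))
    rows = list(csv.DictReader(io.StringIO(sample), dialect=dialect))
    for row in rows:
        print(row)
else:
    print("Could not determine the format; ask whoever produced the file.")
```

If the reported delimiter or quote character looks wrong, that is the point to fall back to the documentation of whatever generated the file.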
lutz