How to read a CSV line that contains "?

A trivial CSV line can be parsed with a string split function. But some lines contain quotes ("), for example:

 "good,morning", 100, 300, "1998,5,3" 

Therefore, simply splitting the line on commas will not solve the problem.

My solution is to first split the line on , and then concatenate the pieces that have " at the beginning or end.

What is the best practice for this problem?

I am wondering if there is a Python or F# code snippet for this.

EDIT: I'm more interested in the implementation detail, rather than using the library.

+5
python csv
4 answers

There is a csv module in Python that handles this.

Edit: This task falls into the "build a lexer" category. The standard way to handle such tasks is to build a state machine (or use a lexer library/framework that will do it for you).

The state machine for this task probably needs only two states:

  • Initial state, in which every character except a comma and a newline is read as part of a field (exception: leading and trailing spaces), a comma is a field separator, and a newline is a record separator. When it encounters an opening quote, it switches to the
  • read-quoted-field state, in which every character (including commas and newlines) except a quote is treated as part of the field. A quote not followed by another quote ends the quoted field (back to the initial state); a quote followed by another quote is treated as a single literal quote (an escaped quote).

By the way, your concatenation solution will break on "Field1","Field2" or "Field1"",""Field2".
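
The two states above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not a replacement for a tested parser: the function name parse_csv and its sep parameter are made up for this example, the quote character is fixed to ", and spaces outside quotes are simply dropped.

```python
def parse_csv(text, sep=","):
    """Parse CSV text into a list of records (lists of field strings)."""
    records = []       # finished records
    fields = []        # finished fields of the current record
    buf = []           # characters of the field being read
    in_quoted = False  # False: initial state, True: read-quoted-field state
    quoted = False     # the current field had a quoted part (keep it unstripped)
    i, n = 0, len(text)
    while i < n:
        c = text[i]
        if in_quoted:
            if c != '"':
                buf.append(c)            # commas and newlines belong to the field
            elif i + 1 < n and text[i + 1] == '"':
                buf.append('"')          # doubled quote: one literal quote
                i += 1
            else:
                in_quoted = False        # closing quote: back to the initial state
        elif c == '"':
            if not "".join(buf).strip():
                buf = []                 # drop spaces before the opening quote
            in_quoted = quoted = True
        elif c == sep or c == "\n":
            value = "".join(buf)
            fields.append(value if quoted else value.strip())
            buf, quoted = [], False
            if c == "\n":                # newline also ends the record
                records.append(fields)
                fields = []
        elif not (quoted and c.isspace()):
            buf.append(c)                # spaces after a closing quote are skipped
        i += 1
    # Flush the last record when the text does not end with a newline.
    if buf or fields or quoted:
        value = "".join(buf)
        fields.append(value if quoted else value.strip())
        records.append(fields)
    return records


print(parse_csv(' "good,morning", 100, 300, "1998,5,3" '))
```

Keeping the "inside quotes" flag as the only real state is what lets commas and newlines pass through untouched while a quoted field is being read.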

+9

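From the Python csv module documentation:

Reading a regular CSV file:

    import csv

    # newline="" lets the csv module handle line endings itself
    with open("some.csv", newline="") as f:
        reader = csv.reader(f)
        for row in reader:
            print(row)

Reading a file with an alternative format:

    import csv

    # colon-separated fields, no quote processing
    with open("passwd", newline="") as f:
        reader = csv.reader(f, delimiter=':', quoting=csv.QUOTE_NONE)
        for row in reader:
            print(row)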
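
Since the question is about a single string rather than a file, here is a small sketch of feeding the example line from the question to the same module. It relies on csv.reader accepting any iterable of strings; the variable names are illustrative, and skipinitialspace=True is used on the assumption that the spaces after the commas are not meant to be part of the fields.

```python
import csv

line = '"good,morning", 100, 300, "1998,5,3"'

# A one-element list is enough: csv.reader iterates over "lines".
# skipinitialspace=True ignores the blank that follows each comma, so the
# quote after ', ' is still recognized as the start of a quoted field.
row = next(csv.reader([line], skipinitialspace=True))
print(row)
```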

There are some good use cases at LinuxJournal.com.

If you are interested in the details, read "separate the line with commas, using quotation marks when the line is not in csv format", which shows some nice regexes for this problem, or just read the source of the csv module.

+3

Chapter 4 of "The Practice of Programming" gives both C and C++ implementations of a CSV parser.

+1

The general implementation would look something like this (a rough sketch that does not handle escaped quotes):

    SEPARATOR = ","  # field separator (assumed to be a comma)

    def csvline2fields(line):
        fields = []
        line = line.strip()
        while line:
            if line[0] in ("'", '"'):
                # Find the matching closing quote, searching past the opening one
                end = line.find(line[0], 1)
                if end == -1:
                    end = len(line)  # unterminated quote: take the rest
                fields.append(line[1:end])
                # Find the beginning of the next field
                nxt = line.find(SEPARATOR, end)
            else:
                # Find the next separator
                nxt = line.find(SEPARATOR)
                fields.append((line if nxt == -1 else line[:nxt]).strip())
            if nxt == -1:
                break
            line = line[nxt + 1:].strip()
        return fields

+1
