How to parse a CSV file in Bash?

I am working on a long Bash script. I want to read cells from a CSV file into Bash variables. I can parse the rows and the first column, but not any other column. Here is my code:

    cat myfile.csv | while read line
    do
        read -d, col1 col2 < <(echo $line)
        echo "I got:$col1|$col2"
    done

It prints only the first column. As an additional test, I tried the following:

    read -d, x y < <(echo a,b,)

And $y is empty. So I tried:

    read x y < <(echo a b)

And $y is b. Why?

+96
linux bash csv
Nov 26 '10 at 15:20
3 answers

You need to use IFS instead of -d:

    while IFS=, read -r col1 col2
    do
        echo "I got:$col1|$col2"
    done < myfile.csv
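For example, given a hypothetical myfile.csv with two rows, the loop splits each line on the comma:

    $ printf 'a,b\nc,d\n' > myfile.csv    # hypothetical sample data
    $ while IFS=, read -r col1 col2; do echo "I got:$col1|$col2"; done < myfile.csv
    I got:a|b
    I got:c|d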

Note that for general-purpose CSV parsing, you should use a specialized tool that can handle quoted fields with internal commas, among other problems that Bash cannot handle on its own. Examples of such tools are csvtool and csvkit.
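As a sketch of the tool-based approach (assuming csvkit is installed; the column indices are illustrative):

    # csvcut, from csvkit, selects columns by index and copes with
    # quoted fields that contain embedded commas.
    csvcut -c 1,2 myfile.csv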

+186
Nov 26 '10 at 16:09

From the man page:

-d delim The first character of delim is used to terminate the input line, rather than newline.

You are using -d, so read terminates the input line at the first comma. It does not read the rest of the line, which is why $y is empty.
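The question's own test shows this (a minimal reproduction, safe to paste into an interactive shell):

    # read stops at the first comma, so only "a" is consumed;
    # it is assigned to x, and y gets nothing.
    read -d, x y < <(echo a,b,)
    echo "x=$x y=$y"    # prints: x=a y=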

+9
Nov 26 '10 at 15:35

We can parse CSV files with quoted fields, delimited by, say, |, with the following code:

    while read -r line
    do
        field1=$(echo "$line" | awk -F'|' '{printf "%s", $1}' | tr -d '"')
        field2=$(echo "$line" | awk -F'|' '{printf "%s", $2}' | tr -d '"')
        echo "$field1 $field2"
    done < "$csvFile"

awk splits each line on | and extracts the field; tr removes the quotation marks.

This is a bit slower, since awk is run once per field.
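A minimal sketch of a faster variant, assuming the same |-delimited input and that no field contains an embedded |: let read split on the delimiter and strip the quotes with parameter expansion, so no external process is started per field:

    while IFS='|' read -r field1 field2
    do
        field1=${field1//\"/}    # remove all double quotes
        field2=${field2//\"/}
        echo "$field1 $field2"
    done < "$csvFile"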

+1
Jan 25 '19 at 8:24


