Ruby CSV doesn't understand \r\n as end of line

I use an iPhone application that periodically emails me a log in CSV format. I have a Ruby script that combines the data in this log with older logs. Recently, the application developer published an update that, for some unknown reason, added a carriage return to the end of each line, which makes my script fail. According to the docs, :row_sep defaults to :auto, which should accept either \r\n or \n (in 1.9.2). I tried Ruby 1.8.7, Ruby 1.9.2, and FasterCSV with 1.8.7. These attempts give various error messages, including

  • CSV::IllegalFormatError
  • Unquoted fields do not allow \r or \n (line 1) ( FasterCSV::MalformedCSVError )
  • cannot duplicate NilClass (TypeError)

in 1.9.2. (The \r is not in a field; it is the end of the line!) The data used to look like this:

 03-12-2012 07:59,120.0,
 03-11-2012 08:27,120.0,
 03-10-2012 07:57,120.0,

Now it looks like this:

 03-12-2012 07:59,120.0,^M
 03-11-2012 08:27,120.0,^M
 03-10-2012 07:57,120.0,^M

Thinking that CSV might be treating ^M as part of the last field, I tried adding another comma:

 03-12-2012 07:59,120.0,,^M 

to no avail.

The only thing I can imagine is that CSV requires all fields to be in double quotes. I can come up with various workarounds, for example reading the file first, chomping the line endings, and then processing the resulting array with CSV, but first I want to find out what I'm doing wrong. It seems like it should work.

By the way, my code is just:

 CSV.foreach(File.join($import_dir, file)) do |record| 

and I tried setting :row_sep => "\r\n", to no avail.

I am on Mac OS X 10.6.8.

+7
5 answers

Works for me in 1.9.3:

 mark@ubuntu:~$ irb
 1.9.3p0 :001 > require 'csv'
  => true
 1.9.3p0 :002 > CSV.foreach("rn.csv") do |row|
 1.9.3p0 :003 >     p row
 1.9.3p0 :004 >   end
 ["1","2","3","4","5"]
 ["6","7","8","9","10"]

And the file does have a carriage return in it:

 mark@ubuntu:~$ od -a rn.csv
 0000000   1   ,   2   ,   3   ,   4   ,   5  cr  nl   6   ,   7   ,   8
 0000020   ,   9   ,   1   0  cr  nl
 0000027
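
The same check can be done from Ruby when od is not at hand. This is only a sketch: the helper name line_ending_counts is made up for illustration, and it assumes the data was read in binary mode (for example with File.binread) so that no newline translation has happened yet.

```ruby
# Hypothetical helper: count each distinct line-ending sequence in a string.
# A run of carriage returns followed by a line feed counts as one sequence,
# so "\r\r\n" is reported separately from "\r\n".
def line_ending_counts(data)
  data.scan(/\r*\n|\r+/).each_with_object(Hash.new(0)) do |ending, counts|
    counts[ending] += 1
  end
end

# Sample string standing in for the file contents.
sample = "1,2,3\r\n4,5,6\r\r\n7,8,9\n"
p line_ending_counts(sample)
```

Any key other than the single ending you expect means the file's line endings are not what you assumed.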
+3

Since CSV has to read ahead in the file to detect the line ending when row_sep is :auto, I needed to do the following to avoid format and encoding exceptions.

  • Decode the file via File.read
  • Remove the extra carriage returns (there may be one or more)
  • Parse the cleaned-up contents as CSV

 file = File.read(temp_file.path, encoding: 'ISO-8859-1:UTF-8')
 file = file.tr("\r", '')
 CSV.parse(file, headers: true) do |row|
   # do all the things
 end

Note: I am using Ruby 2.1.3 in a Rails 4 application.

+6

Try setting row_sep to

 "\r\n"

This is different from '\r\n': single quotes only let you escape ' and \; any other backslash is treated as a literal backslash, i.e.

 '\r' == "\\r"

is true.
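
A quick way to see the difference is to compare the two literals directly. This is just an illustrative sketch; the variable names are arbitrary.

```ruby
single = '\r\n'   # four characters: backslash, r, backslash, n
double = "\r\n"   # two characters: carriage return, line feed

p single.length        # 4
p double.length        # 2
p single == "\\r\\n"   # true
p double.bytes         # [13, 10]
```

So a single-quoted row separator asks CSV to split on a literal backslash sequence that never occurs in the file.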

+5

You mentioned trying :row_sep => '\r\n'. Single quotes treat (most instances of) the backslash as a literal backslash character; try :row_sep => "\r\n" with double quotes.

+3

The lines of the file actually end with \r\r\n, not \r\n. This is embarrassing; I should have checked the file in more detail. I assumed that the line ending was \n since I am on a Unix box. But when Emacs opened the file, it automatically went into "DOS" mode, so it displayed \r\n as a newline and showed only the extra \r as ^M.
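
One possible way to cope with such a file is to collapse the stray carriage returns before handing the text to CSV. This is a minimal sketch, assuming the whole log fits in memory; the helper name normalize_newlines and the sample string are made up for illustration.

```ruby
require 'csv'

# Hypothetical helper: turn any run of carriage returns that precedes a
# line feed ("\r\n", "\r\r\n", ...) into a plain "\n".
def normalize_newlines(text)
  text.gsub(/\r+\n/, "\n")
end

# Sample data in the shape shown in the question, with "\r\r\n" line endings.
raw = "03-12-2012 07:59,120.0,\r\r\n03-11-2012 08:27,120.0,\r\r\n"

rows = CSV.parse(normalize_newlines(raw))
rows.each { |row| p row }
```

Because of the trailing comma on each line, every parsed row ends with an empty (nil) field.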

+2