Universal newline support in Ruby that includes \r (CR) line endings

In a Rails application, I accept and process CSV files that may use any of three line terminators: \n (LF), \r\n (CR+LF), or \r (CR). The Ruby File and CSV libraries seem to handle the first two cases well, but the last case ("Mac classic" \r line endings) is not treated as a newline. It is important to be able to accept this format as well as the others, since Microsoft Excel for Mac (running on OS X) seems to use it when exporting to "Comma Separated Values" (although exporting to "Windows Comma Separated" produces the easier-to-handle \r\n).

Python has "universal newline support" and can handle any of these three formats without problems. Is there something similar in Ruby that will accept all three without knowing the format in advance?

+7
ruby file newline csv line-endings
1 answer

You can use :row_sep => :auto:

:row_sep
The String appended to the end of each row. This can be set to the special :auto setting, which requests that CSV automatically discover this from the data. Auto-discovery reads ahead in the data looking for the next "\r\n", "\n", or "\r" sequence.

There are some caveats, of course; see the manual linked above for details.
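As a rough illustration of the option above, here is a minimal sketch that parses the same small table with each of the three line endings. The sample data and the helper name parse_any_eol are made up for this example; only CSV.parse and the :row_sep option come from the CSV library.

```ruby
require "csv"

# Hypothetical helper: parse CSV text and let the library detect the row separator.
def parse_any_eol(data)
  CSV.parse(data, row_sep: :auto)
end

# The same two-row table, once per line-ending style (made-up sample data).
samples = {
  "LF"   => "name,qty\napple,3\n",
  "CRLF" => "name,qty\r\napple,3\r\n",
  "CR"   => "name,qty\rapple,3\r"
}

samples.each do |label, data|
  puts "#{label}: #{parse_any_eol(data).inspect}"
end
```

Since detection is based on the first line ending found while reading ahead, a file that mixes styles may not split the way you expect.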

You could also manually clean up the EOLs with a bit of gsub before handing the data to CSV for parsing. I would probably take this route and manually convert all \r\n and \r to single \n before trying to parse the CSV. OTOH, this will not work if the CSV has embedded binary data, where \r means something. On the gripping hand, this is CSV we are dealing with, so who knows what sort of crazy broken nonsense you will come across.
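The gsub route might look something like the sketch below. The helper name normalize_eol and the sample string are assumptions for illustration; the regular expression matches a CR optionally followed by an LF, so both \r\n and bare \r become a single \n.

```ruby
require "csv"

# Hypothetical helper: convert CRLF and bare CR line endings to LF.
# Note that this also rewrites any CR inside quoted fields.
def normalize_eol(text)
  text.gsub(/\r\n?/, "\n")
end

# Made-up sample that mixes all three line-ending styles.
raw = "name,qty\rapple,3\r\npear,5\n"

rows = CSV.parse(normalize_eol(raw))
puts rows.inspect
```

Unlike relying on automatic detection, this also copes with input that mixes line-ending styles, at the cost of altering any \r that was meant to be data.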

+17
