Amazon Redshift - COPY from CSV - single double quote per line - Invalid quote formatting for CSV Error

I am loading a CSV file from S3 into Redshift. This CSV file is analytics data that contains a PageUrl field (which may contain information about the user's search inside the query string, for example).

The load chokes on rows that contain a single double-quote character. For example, if there is a page for a 14-inch toy, the PageUrl will contain:

http://www.mywebsite.com/a-14"-toy/1234.html

Redshift, for obvious reasons, cannot handle this because it expects a closing double-quote character.

As I can see, my options are:

  • Pre-process input and delete these characters
  • Configure the COPY command in Redshift to ignore these characters but still load the string
  • Set MAXERROR to a high value and mop up the rejected rows using a separate process.

Option 2 would be perfect, but I can't find it!

Any other suggestions, in case I'm just not looking hard enough?

thanks

Duncan

+8
3 answers

Unfortunately, there is no way to fix this in COPY itself. You will need to pre-process the file before loading it into Amazon Redshift.

The closest parameters available are CSV [ QUOTE [AS] 'quote_character' ], which wraps fields in an alternative quote character, and ESCAPE, which applies when the quote character is preceded by a backslash. Alas, both require the file to be in a specific format before loading.
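For the pre-processing route, here is a minimal Python sketch of one way to do it. It is an illustration, not a tested production script: the function name `requote_csv` is made up, and it assumes the input never uses quotes to protect a delimiter inside a field (so every double quote is literal data). It reads each row with quoting switched off and writes it back with embedded quotes doubled, which is the form the CSV option of COPY expects.

```python
import csv
import io


def requote_csv(src, dst, delimiter=","):
    """Copy rows from src to dst, doubling any embedded double quotes.

    Assumption: no input field relies on quotes to hide a delimiter.
    """
    # QUOTE_NONE makes the reader treat '"' as an ordinary character,
    # so a lone quote cannot swallow the rest of the line.
    reader = csv.reader(src, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    # QUOTE_MINIMAL wraps any field containing '"' in quotes and doubles
    # the embedded quote (doublequote=True is the default).
    writer = csv.writer(
        dst,
        delimiter=delimiter,
        quoting=csv.QUOTE_MINIMAL,
        lineterminator="\n",
    )
    for row in reader:
        writer.writerow(row)


if __name__ == "__main__":
    raw = io.StringIO('1,http://www.mywebsite.com/a-14"-toy/1234.html,5\n')
    cleaned = io.StringIO()
    requote_csv(raw, cleaned)
    print(cleaned.getvalue(), end="")
```

For real files you would pass file objects opened with `newline=""` instead of the `io.StringIO` buffers used in the demo.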


+5

It's 2017 and I ran into the same problem. I'm glad to report that there is now a way to get Redshift to load CSV files with odd quotes in the data.

The trick is to use the ESCAPE keyword and NOT to use the CSV keyword. I don't know why, but using the CSV and ESCAPE keywords together in the COPY command produced the error message "CSV is not compatible with ESCAPE;". However, without changing the data at all, I was able to load it successfully after removing the CSV keyword from the COPY command.

You can also refer to this documentation for help: http://docs.aws.amazon.com/redshift/latest/dg/copy-parameters-data-conversion.html#copy-escape
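To make the shape of that command concrete, here is a small Python sketch that assembles a COPY statement using DELIMITER and ESCAPE but no CSV keyword. The helper name `build_copy_sql`, the table name, the bucket path, and the role ARN are all placeholders I made up for illustration. It does plain string formatting with no input sanitising, so treat it as a sketch only. The optional MAXERROR clause corresponds to the third option in the question.

```python
def build_copy_sql(table, s3_uri, iam_role, delimiter=",", max_errors=0):
    """Return a Redshift COPY statement that omits the CSV keyword.

    All arguments are inserted verbatim; do not pass untrusted input.
    """
    parts = [
        f"COPY {table}",
        f"FROM '{s3_uri}'",
        f"IAM_ROLE '{iam_role}'",
        f"DELIMITER '{delimiter}'",
        # ESCAPE makes a backslash in the input act as an escape character.
        "ESCAPE",
    ]
    if max_errors > 0:
        # Tolerate up to max_errors rejected rows instead of failing the load.
        parts.append(f"MAXERROR {max_errors}")
    return "\n".join(parts) + ";"


if __name__ == "__main__":
    # Hypothetical table, bucket, and role, purely for illustration.
    print(
        build_copy_sql(
            "analytics.page_views",
            "s3://my-bucket/page_views.csv",
            "arn:aws:iam::123456789012:role/MyRedshiftRole",
        )
    )
```

One thing to keep in mind with this approach: a field that contains the delimiter itself would still need to be escaped in the data, since there is no quoting to protect it.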

+8

It's early 2019 and the problem still persists. However, the solution posted by ayeletd still works for me, not only for odd quotes in the data but for all kinds of funny characters, such as:

„",! () - '"" ""? @ :. "" ™ ® & + # |% / []

-1
