In Redshift (PostgreSQL), can you skip columns with the COPY command?

  • I have a .csv file (t1) with columns c1, c2, c3 in Amazon S3 storage
  • I want to copy it into Amazon Redshift
  • I create a target table with columns c1, c2, c3, all of which are nullable
  • I run the COPY command:

    copy t1a (c1, c3) from t1

  • I expected it to copy c1 and c3 from t1 and leave c2 at its default value of NULL, so that a row in t1a would look like (c1_rowX, null, c3_rowX).

  • Instead, I get a type error, because it copies the c2 data (string type) from t1 into c3 (int type) of t1a.

  • The COPY command works fine when I don't specify the columns:

    copy t1a from t1

  • I have included a link to the documentation for the Redshift COPY command:

http://docs.aws.amazon.com/redshift/latest/dg/r_COPY.html

  • So the main question is: what is wrong with my use of the column list? Thanks.
+8

Tags: postgresql, amazon-redshift

2 Answers

If you want to skip the preprocessing step, you can define the column to be skipped as CHAR(1) and then use the TRUNCATECOLUMNS parameter of the COPY command:

    CREATE TABLE t1a (c1, c2 CHAR(1), c3);
    COPY t1a FROM t1 TRUNCATECOLUMNS;

During import, TRUNCATECOLUMNS discards any data longer than the column's defined length in the table schema, so everything loaded into that column will be truncated to 1 character.

Preprocessing the source file is the recommended approach, but sometimes a hack like this is all you need.
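
For concreteness, here is a minimal sketch of that hack using full Redshift COPY syntax; the column types, S3 path, and IAM role are assumptions for illustration, not taken from the question:

    -- Sketch only: column types, bucket path, and IAM role are hypothetical.
    CREATE TABLE t1a (
        c1 VARCHAR(64),  -- assumed type
        c2 CHAR(1),      -- dummy column; incoming string data gets truncated to 1 char
        c3 INT           -- int type, per the question
    );

    COPY t1a
    FROM 's3://my-bucket/t1.csv'  -- hypothetical path
    IAM_ROLE 'arn:aws:iam::111111111111:role/MyRedshiftRole'  -- hypothetical role
    CSV
    TRUNCATECOLUMNS;

Note that after the load, c2 holds a single truncated character rather than NULL, so downstream queries should simply ignore that column.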

+10

Fine-grained column mappings like that are not supported directly by COPY, unless Amazon / ParAccel have added something extra to their PostgreSQL fork that is not in mainline PostgreSQL.

What you usually do in this case is write a data-massaging script: read the CSV, transform it as desired, and then send the lines over a PostgreSQL connection using COPY ... FROM STDIN, as sketched below.
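
On the database side, the statement such a script streams into would look something like this (a sketch in mainline PostgreSQL COPY syntax, which is what this answer describes; the column list comes from the question):

    -- Mainline-PostgreSQL sketch: the massaging script writes its transformed
    -- rows (c1 and c3 only) to this statement's standard input.
    COPY t1a (c1, c3) FROM STDIN WITH (FORMAT csv);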

Alternatively, you can COPY the entire source CSV into a staging table, and then transform that data into the real target table with INSERT INTO ... SELECT, as in the sketch below.
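
A minimal sketch of that staging approach; the staging table name, column types, S3 path, and IAM role are assumptions for illustration:

    -- Sketch only: t1_staging and the types, path, and role are hypothetical.
    CREATE TABLE t1_staging (
        c1 VARCHAR(64),   -- assumed type
        c2 VARCHAR(64),   -- string type, per the question
        c3 INT
    );

    COPY t1_staging
    FROM 's3://my-bucket/t1.csv'  -- hypothetical path
    IAM_ROLE 'arn:aws:iam::111111111111:role/MyRedshiftRole'  -- hypothetical role
    CSV;

    -- Keep only the wanted columns; c2 in t1a stays NULL.
    INSERT INTO t1a (c1, c3)
    SELECT c1, c3
    FROM t1_staging;

    DROP TABLE t1_staging;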

+4
