Amazon Data Pipeline: how to use script arguments in SqlActivity?

When trying to use the scriptArgument field in SqlActivity:

 {
   "id" : "ActivityId_3zboU",
   "schedule" : { "ref" : "DefaultSchedule" },
   "scriptUri" : "s3://location_of_script/unload.sql",
   "name" : "unload",
   "runsOn" : { "ref" : "Ec2Instance" },
   "scriptArgument" : [
     "'s3://location_of_unload/#{format(minusDays(@scheduledStartTime,1),'YYYY/MM/dd/hhmm/')}'",
     "'aws_access_key_id=????;aws_secret_access_key=*******'"
   ],
   "type" : "SqlActivity",
   "dependsOn" : { "ref" : "ActivityId_YY69k" },
   "database" : { "ref" : "RedshiftCluster" }
 }

where unload.sql Script contains:

  unload (' select * from tbl1 ') to ? credentials ? delimiter ',' GZIP; 

or:

  unload (' select * from tbl1 ') to ?::VARCHAR(255) credentials ?::VARCHAR(255) delimiter ',' GZIP; 

the activity fails with:

 syntax error at or near "$1" Position 

Any idea what I'm doing wrong?

3 answers

I believe you are running this SqlActivity against Redshift. Can you modify your SQL script to refer to parameters using positional notation? To refer to parameters in the SQL expression itself, use $1, $2, etc.

See http://www.postgresql.org/docs/9.1/static/sql-prepare.html
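For reference, this is what positional notation looks like in a plain PostgreSQL prepared statement (a sketch using the tbl1 table from the question; the statement name and the some_col column are made up for illustration):

```sql
-- $1 is bound positionally when the statement is executed.
PREPARE filter_stmt (varchar) AS
  SELECT * FROM tbl1 WHERE some_col = $1;  -- some_col is hypothetical
EXECUTE filter_stmt('123');
DEALLOCATE filter_stmt;
```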


This is a script that works fine with the psql shell:

 insert into tempsdf select * from source where source.id = '123'; 

Here are some of my tests of SqlActivity using Data Pipeline:


Test 1: use ?

insert into mytable select * from source where source.id = ?; - works fine if used with either the "script" or the "scriptUri" parameter of the SqlActivity object,

where "scriptArgument" : "123"

Here, ? can stand in for the value in a condition, but not for the condition itself.
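For context, a minimal SqlActivity pairing an inline script with a single scriptArgument might look like this (a sketch; the object id and refs are placeholders, not taken from the question):

```json
{
  "id" : "SqlActivity_example",
  "type" : "SqlActivity",
  "script" : "insert into mytable select * from source where source.id = ?;",
  "scriptArgument" : "123",
  "database" : { "ref" : "RedshiftCluster" },
  "runsOn" : { "ref" : "Ec2Instance" }
}
```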


Test 2: using parameters works only when the command is specified via the 'script' option

insert into #{myTable} select * from source where source.id = ?; - works fine if used only with the 'script' option

 insert into #{myTable} select * from source where source.id = #{myId}; 
  • works fine if used only with the 'script' option

where #{myTable} and #{myId} are parameters whose values can be declared in the pipeline template.

http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-custom-templates.html

(when you use only parameters, make sure you delete unused scriptArguments - otherwise the activity will still throw an error)
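A parameterized template declares those values roughly like this (a sketch only; the exact attribute set is described in the guide linked above, and the descriptions here are made up):

```json
{
  "parameters" : [
    { "id" : "myTable", "type" : "String", "description" : "target table" },
    { "id" : "myId", "type" : "String", "description" : "id to filter on" }
  ],
  "values" : { "myTable" : "tempsdf", "myId" : "123" }
}
```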


FAILED TESTS:

insert into ? select * from source where source.id = ?;

insert into ? select * from source where source.id = '123';

Both of these commands fail because:

Table names cannot be passed as placeholders for script arguments. '?' can only be used to pass values for a comparison condition and column values.


insert into #{myTable} select * from source where source.id = #{myId}; - does not work if used as "scriptUri"

insert into tempsdf select * from source where source.id = #{myId}; - does not work if used as "scriptUri"

The above 2 commands fail because:

Parameters cannot be evaluated if the script is stored in S3.


insert into tempsdf select * from source where source.id = $1; - does not work with 'scriptUri'

insert into tempsdf values ($1, $2, $3); - does not work.

Using $1-style positional references does not work in any combination.


Other tests:

 "scriptArgument" : "123"
 "scriptArgument" : "456"
 "scriptArgument" : "789"

insert into tempsdf values (?,?,?); - works with both 'script' and 'scriptUri', and translates to insert into tempsdf values ('123','456','789');

The scriptArguments are bound in the order you declare them, each replacing the next "?" in the script.
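That ordered pairing behaves roughly like the toy substitution below (a sketch of the observed behavior only; Data Pipeline actually binds the values through the database driver rather than by text replacement):

```python
def substitute_placeholders(script, args):
    """Replace each '?' in a SQL script with the corresponding
    scriptArgument, in declaration order (mimics the observed
    SqlActivity behavior; not the real implementation)."""
    parts = script.split("?")
    if len(parts) - 1 != len(args):
        raise ValueError("number of '?' placeholders must match scriptArguments")
    out = parts[0]
    for part, arg in zip(parts[1:], args):
        out += "'" + arg + "'" + part  # quote each argument as a literal
    return out

sql = substitute_placeholders(
    "insert into tempsdf values (?,?,?);", ["123", "456", "789"]
)
print(sql)  # insert into tempsdf values ('123','456','789');
```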



In a ShellCommandActivity, we specify two scriptArgument attributes and access them as $1 and $2 inside the shell script (.sh):

 "scriptArgument" : "'s3://location_of_unload/#{format(minusDays(@scheduledStartTime,1),'YYYY/MM/dd/hhmm/')}'"  # accessed as $1
 "scriptArgument" : "'aws_access_key_id=????;aws_secret_access_key=*******'"  # accessed as $2
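The referenced .sh script can then use those arguments positionally. A minimal sketch (the function name is made up, and the statement it builds mirrors the unload.sql from the question):

```shell
#!/bin/sh
# Build the UNLOAD statement from two positional arguments, the way a
# ShellCommandActivity passes its scriptArguments as $1 and $2.
build_unload() {
  s3_path="$1"       # first scriptArgument: S3 unload location
  credentials="$2"   # second scriptArgument: Redshift credentials string
  echo "unload ('select * from tbl1') to '${s3_path}' credentials '${credentials}' delimiter ',' GZIP;"
}

# Example invocation with clearly hypothetical placeholder values:
build_unload "s3://my-bucket/2015/01/01/0000/" "aws_access_key_id=AAA;aws_secret_access_key=BBB"
```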

I do not know if this will work for you.
