I have several processes that constantly refresh tables in Redshift. Each one starts a transaction, creates a new table, COPYs all the data from S3 into the new table, then drops the old table and renames the new table to the old table's name.
pseudo code:
begin;
create table foo_temp (...);
copy foo_temp from 's3://...' ...;
drop table foo;
alter table foo_temp rename to foo;
commit;
I have dozens of tables that I update this way. This works well, but I would like to have several processes performing these table updates, for redundancy and to ensure the data stays reasonably fresh (different processes may update different tables at the same time).
This works fine as long as two processes don't try to update the same table. When they do, the second process is blocked by the first until it finishes, and once it does, the second process gets an error:
ERROR: table 12345 dropped by concurrent transaction
Is there an easy way to ensure that only one of my processes updates a given table at a time, so that the second process doesn't run into this error?
I thought about creating a special lock table for each of my real tables. Before working on a real table, a process would LOCK the companion lock table. I think this would work, but I would like to avoid creating a special lock table for every one of my tables.
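Roughly what I had in mind (foo_lock and its dummy column are just placeholder names), assuming an empty companion table created once per real table that exists only to be locked:

create table if not exists foo_lock (dummy int);  -- one-time setup per real table

begin;
lock foo_lock;                       -- a second process blocks here until commit
create table foo_temp (like foo);
copy foo_temp from 's3://...' ...;   -- same COPY as before
drop table foo;
alter table foo_temp rename to foo;
commit;                              -- releases the lock on foo_lock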