Should I use a "serial" primary key in case I ever need to?
You can easily add a sequential column later if you need it:
ALTER TABLE product_pricebands ADD COLUMN id serial;
The column will be filled with unique values automatically. You can even make it the primary key in the same statement (if no primary key is defined yet):
ALTER TABLE product_pricebands ADD COLUMN id serial PRIMARY KEY;
If you reference the table from other tables, I would advise using such a surrogate primary key, because a foreign key spanning four columns is rather unwieldy. It is also slower in SELECTs with JOINs.
In any case, you should define a primary key. A UNIQUE index that includes a nullable column is not a full replacement. It allows duplicate combinations that include a NULL value, because two NULL values are never considered equal. This can lead to trouble.
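A minimal sketch of that pitfall, assuming default PostgreSQL behaviour. The scratch table pricebands_demo and its column types are hypothetical stand-ins, not the real product_pricebands definition:

```sql
-- Hypothetical scratch table; column types are assumptions.
CREATE TEMP TABLE pricebands_demo (
    template_sku text    NOT NULL,
    siteid       integer NOT NULL,
    currencyid   integer NOT NULL,
    colourid     integer,            -- nullable, as in the question
    UNIQUE (template_sku, siteid, currencyid, colourid)
);

INSERT INTO pricebands_demo VALUES ('sku1', 1, 1, NULL);
-- Accepted as well: the two NULLs are not considered equal,
-- so the UNIQUE constraint sees no conflict.
INSERT INTO pricebands_demo VALUES ('sku1', 1, 1, NULL);

-- Both rows are present.
SELECT count(*) FROM pricebands_demo;
```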
As the
colourid field may be NULL
you might want to create two unique indexes. The combination (template_sku, siteid, currencyid, colourid) cannot be the PRIMARY KEY because colourid can be NULL, but you can create a UNIQUE constraint like you already have (which creates an index automatically):
ALTER TABLE product_pricebands ADD CONSTRAINT product_pricebands_uni_idx UNIQUE (template_sku, siteid, currencyid, colourid);
This index perfectly covers the queries you mentioned in 2).
In addition, create a partial unique index if you want to avoid "duplicates" with (colourid IS NULL):
CREATE UNIQUE INDEX product_pricebands_uni_null_idx ON product_pricebands (template_sku, siteid, currencyid) WHERE colourid IS NULL;
That covers all bases. I wrote more about this technique in a related answer on dba.SE.
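To see how the two indexes work together, here is a small sketch on a hypothetical scratch table (pricebands_demo2, with assumed column types and made-up constraint names). The statements marked as failing should raise a unique violation:

```sql
-- Hypothetical scratch table; names and column types are assumptions.
CREATE TEMP TABLE pricebands_demo2 (
    template_sku text    NOT NULL,
    siteid       integer NOT NULL,
    currencyid   integer NOT NULL,
    colourid     integer
);

-- Covers rows where colourid is set.
ALTER TABLE pricebands_demo2
    ADD CONSTRAINT pricebands_demo2_uni_idx
    UNIQUE (template_sku, siteid, currencyid, colourid);

-- Covers rows where colourid IS NULL.
CREATE UNIQUE INDEX pricebands_demo2_uni_null_idx
    ON pricebands_demo2 (template_sku, siteid, currencyid)
    WHERE colourid IS NULL;

INSERT INTO pricebands_demo2 VALUES ('sku1', 1, 1, NULL);  -- ok
INSERT INTO pricebands_demo2 VALUES ('sku1', 1, 1, NULL);  -- fails: partial unique index
INSERT INTO pricebands_demo2 VALUES ('sku1', 1, 1, 7);     -- ok
INSERT INTO pricebands_demo2 VALUES ('sku1', 1, 1, 7);     -- fails: UNIQUE constraint
```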
A simple alternative to the above would be to make colourid NOT NULL and create a primary key instead of the above product_pricebands_uni_idx.
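A rough sketch of that alternative. It assumes that 0 is not a real colourid and can serve as a stand-in for "no colour"; that sentinel is my assumption, not something from the question. If colourid references another table via a foreign key, a matching row for the sentinel would be needed there first.

```sql
BEGIN;

-- Assumption: 0 is unused as a real colourid and can mean "no colour".
UPDATE product_pricebands SET colourid = 0 WHERE colourid IS NULL;

ALTER TABLE product_pricebands
    ALTER COLUMN colourid SET DEFAULT 0,
    ALTER COLUMN colourid SET NOT NULL;

-- The primary key takes over the job of the UNIQUE constraint.
ALTER TABLE product_pricebands
    DROP CONSTRAINT IF EXISTS product_pricebands_uni_idx;
ALTER TABLE product_pricebands
    ADD PRIMARY KEY (template_sku, siteid, currencyid, colourid);

COMMIT;
```

With colourid declared NOT NULL, a partial index on (colourid IS NULL) no longer has anything to cover.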
Also, since you
basically DELETE most of the data
for your refill operation, it will be faster to drop indexes that are not needed during the refill and recreate them afterwards. Creating an index from scratch is faster by an order of magnitude than adding all rows incrementally.
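The pattern could look roughly like this. The index product_pricebands_siteid_idx is a hypothetical example of a secondary index. Keep any unique indexes in place if you rely on them to reject duplicates while loading.

```sql
BEGIN;

-- Hypothetical secondary index that the refill itself does not need.
DROP INDEX IF EXISTS product_pricebands_siteid_idx;

-- ... run the DELETE and the bulk INSERT of the refill here ...

-- Rebuild the index in one pass over the final data.
CREATE INDEX product_pricebands_siteid_idx
    ON product_pricebands (siteid);

COMMIT;
```

Running everything in one transaction means other sessions never see the table without the index, at the cost of holding locks on the table until COMMIT.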
How do you know which indexes are used (needed)?
- Test your queries with EXPLAIN ANALYZE.
- Or use the built-in statistics. pgAdmin displays statistics on a separate tab for the selected object.
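Both checks can be sketched as follows. The filter values in the first query are made up for illustration, and the column types are assumed. The second query reads the standard statistics view pg_stat_user_indexes:

```sql
-- Shows the actual plan, including which index (if any) is used.
-- The literal values are placeholders.
EXPLAIN ANALYZE
SELECT *
FROM   product_pricebands
WHERE  template_sku = 'sku1'
AND    siteid = 1
AND    currencyid = 1;

-- Index usage counters since the statistics were last reset.
-- An idx_scan that stays at 0 suggests the index is not used.
SELECT indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
FROM   pg_stat_user_indexes
WHERE  relname = 'product_pricebands'
ORDER  BY idx_scan;
```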
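It may also be faster to select the few rows with my_custom_field = TRUE into a temporary table, TRUNCATE the base table and re-INSERT the survivors. Whether that works depends on whether you have foreign keys defined: TRUNCATE is not allowed on a table that other tables reference unless those tables are truncated too. It would look like this: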
CREATE TEMP TABLE pr_tmp AS SELECT * FROM product_pricebands WHERE my_custom_field;
TRUNCATE product_pricebands;
INSERT INTO product_pricebands SELECT * FROM pr_tmp;
This avoids a lot of vacuuming.