(PostreSQL 8.2 running on WindowsXP)
I have many complex requests that take a few seconds to complete. They are not “views” in reality, but can be considered as such.
I decided to keep records of the results from these "views" in tables, which I call "auxiliary" tables.
I can guarantee that after calculating the "aux" tables there are no data changes.
Take one example:
Suppose I have a query “X”, so I save it in “Table_X”. The record set looks something like this:
PERSON* FIELD_A* FIELD_ B FIELD_C ======================================================= 1 10 Value1 Value2 1 20 Value3 Value4 1 30 Value5 Value6 ------------------------------------------------------ 2 10 Value1 Value2 2 20 Value3 Value4 ------------------------------------------------------ 3 20 Value3 Value4 3 30 Value5 Value6 ------------------------------------------------------ etc.. (*)Primary key is: person, field_a
As you can see, each “person” has its own subset of entries in this table.
So, I can quickly pick up his records only with "select * from table_x where person = <person>" .
I will always receive ONLY <person> , and all my requests have the same "face": "MAN" + "Some_Fields".
IMPORTANT: all "aux" tables can be read (obviously with "old" data until I complete) with other transactions, while I "re-populate" them. But I can guarantee that they will never be updated by these transactions.
My current process:
- START TRANSACTION; - DO A LOTS OF OPERATIONS ON DATABASE. INSERT / UPDATE / DELETE ON SEVERAL TABLES. - AFTER THAT, I WILL CALCULATE "AUX" TABLES - LOOP THROUGH ALL MY "QUERIES": (WHERE HERE WE CAN CALL AS "X") - LOOP TROUGHT ALL "PERSON": (WHERE HERE WE CAN CALL AS <person>) - DELETE FROM <TABLE_X> WHERE PERSON = <person>; - INSERT INTO <TABLE_X> (PERSON, FIELD_A, FIELD_B, FIELD_C) (SELECT <person>, FIELDS... FROM "LOTS OF TABLES" JOIN "COMPLEX SQL"... WHERE SOME_FIELD = <person> ); - END LOOP "PERSON" - END LOOP "QUERIES" - COMMIT;
Questions:
Some of these tables contain thousands of records, and often only a few records need to be updated / deleted / inserted when compared with an already set record set in the table.
As deleting and “reinserting” causes too much “disk I / O” (so obvious) and I need to “update” just a few records, I am trying to get an efficient way to do this.
I tried deleting / updating / pasting in the divided steps, doing it directly from the “complex query”, but it takes too much time because the query was executed 3 times (one time to delete, another to update e another to insert).
Any suggestions?