Best way to materialize query results for performance

(PostgreSQL 8.2 running on Windows XP)

I have many complex queries that take a few seconds each to complete. They are not actual "views", but they can be thought of as such.

I decided to store the results of these "views" in tables, which I call "auxiliary" tables.

I can guarantee that, once the "aux" tables have been calculated, the underlying data does not change.

Take one example:

Suppose I have a query "X" whose results I save in "TABLE_X". The record set looks something like this:

PERSON*  FIELD_A*  FIELD_B  FIELD_C
===================================
1        10        Value1   Value2
1        20        Value3   Value4
1        30        Value5   Value6
-----------------------------------
2        10        Value1   Value2
2        20        Value3   Value4
-----------------------------------
3        20        Value3   Value4
3        30        Value5   Value6
-----------------------------------
etc.

(*) Primary key is: (person, field_a)

As you can see, each "person" has its own subset of rows in this table.

So I can quickly fetch one person's rows with just: "select * from table_x where person = <person>".

I will always query by <person> only, and all my aux tables have the same "shape": "PERSON" + "some fields".

IMPORTANT: all "aux" tables can be read by other transactions (which, obviously, see the "old" data until I commit) while I re-populate them. But I can guarantee that those transactions will never update them.

My current process:

 - START TRANSACTION;
 - DO LOTS OF OPERATIONS ON THE DATABASE: INSERT / UPDATE / DELETE ON SEVERAL TABLES.
 - AFTER THAT, RECALCULATE THE "AUX" TABLES:
 - LOOP THROUGH ALL MY "QUERIES" (call each one "X"):
    - LOOP THROUGH ALL "PERSON" values (call each one <person>):
       - DELETE FROM TABLE_X WHERE PERSON = <person>;
       - INSERT INTO TABLE_X (PERSON, FIELD_A, FIELD_B, FIELD_C)
         (SELECT <person>, FIELDS...
          FROM "LOTS OF TABLES" JOIN "COMPLEX SQL"...
          WHERE SOME_FIELD = <person>);
    - END LOOP "PERSON"
 - END LOOP "QUERIES"
 - COMMIT;

Questions:

Some of these tables contain thousands of rows, and often only a few rows actually need to be updated / deleted / inserted compared with the record set already stored in the table.

Since deleting and reinserting everything causes a lot of disk I/O (obviously), and I really only need to "update" a few rows, I am looking for an efficient way to do this.

I tried doing the delete / update / insert as separate steps driven directly by the "complex query", but it takes too long because the query gets executed three times (once to find rows to delete, once for rows to update, and once for rows to insert).
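One way to avoid executing the complex query three times is to materialize its result once into a temporary table, then derive the delete, update, and insert sets from that snapshot. A sketch for a single <person> (here 1), with illustrative table and column names taken from the question; the syntax used (CREATE TEMP TABLE ... AS, UPDATE ... FROM, correlated NOT EXISTS) is available in PostgreSQL 8.2:

```sql
BEGIN;

-- Run the expensive query exactly once, into a temp snapshot.
CREATE TEMP TABLE new_x AS
SELECT person, field_a, field_b, field_c
FROM   "lots of tables"          -- the complex joins go here
WHERE  some_field = 1;           -- <person>

-- 1) Delete rows that no longer appear in the new result set.
DELETE FROM table_x t
WHERE  t.person = 1
AND    NOT EXISTS (SELECT 1 FROM new_x n
                   WHERE n.person = t.person
                   AND   n.field_a = t.field_a);

-- 2) Update only rows whose non-key fields actually changed.
UPDATE table_x t
SET    field_b = n.field_b,
       field_c = n.field_c
FROM   new_x n
WHERE  t.person  = n.person
AND    t.field_a = n.field_a
AND   (t.field_b <> n.field_b OR t.field_c <> n.field_c);

-- 3) Insert rows that are new.
INSERT INTO table_x (person, field_a, field_b, field_c)
SELECT n.person, n.field_a, n.field_b, n.field_c
FROM   new_x n
WHERE  NOT EXISTS (SELECT 1 FROM table_x t
                   WHERE t.person  = n.person
                   AND   t.field_a = n.field_a);

DROP TABLE new_x;
COMMIT;
```

This touches only the rows that actually changed, so the delete/insert disk I/O is proportional to the size of the diff rather than to the whole per-person record set.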

Any suggestions?

+4
2 answers

Two standard references for rolling your own materialized views in PostgreSQL: PostgreSQL / Materialized Views and Materialized Views that really work.

+5

Before you do any of this, have you run EXPLAIN on your complex query and added indexes to improve it?
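For example (table and column names here are placeholders standing in for the question's "complex SQL"):

```sql
-- See how the planner actually executes the expensive query.
EXPLAIN ANALYZE
SELECT person, field_a, field_b, field_c
FROM   some_table
JOIN   other_table USING (person)
WHERE  some_field = 1;

-- If the plan shows a sequential scan on the filter column,
-- an index there may remove the need to materialize at all:
CREATE INDEX idx_some_table_some_field ON some_table (some_field);
```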

If you must do this, forget all the looping; nothing you write will be better optimized than the database's internal compiled code. Just write a view, and materialize it if you want by selecting * from it into the table. In many cases this will be faster than looping, deleting, and inserting.
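A minimal sketch of that suggestion, using illustrative view/table names. Note the use of DELETE rather than TRUNCATE: TRUNCATE takes an exclusive lock, while a plain DELETE inside the transaction lets concurrent readers keep seeing the old rows until the commit, which is what the question requires:

```sql
-- Define the complex query once, as a view.
CREATE VIEW x_view AS
SELECT person, field_a, field_b, field_c
FROM   some_table
JOIN   other_table USING (person);

-- Materialize it set-wise, with no per-person looping.
BEGIN;
DELETE FROM table_x;                      -- not TRUNCATE: keeps old data readable
INSERT INTO table_x SELECT * FROM x_view; -- single set-based pass
COMMIT;
```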

+2
