I have a table T1 with 60 rows and 5 columns: ID1, ID2, info1, info2, info3.
I have another table T2 with 1.2 million rows and 5 columns: ID3, ID2, info4, info5, info6.
I want to get (ID1, ID2, info4, info5, info6) from all rows where ID2 matches. Currently my query looks like this:
SELECT T1.ID1, T2.ID2, T2.info4, T2.info5, T2.info6 FROM T1, T2 WHERE T1.ID2 = T2.ID2;
It takes about 15 seconds. My question is: should it take that long, and if not, how can I speed it up? I suspect it should not, since T1 is so small.
I asked PostgreSQL to EXPLAIN the query, and it says that it hashes T2 and then hash-joins that hash with T1. Hashing T2 seems to be the part that takes so long. Is there a way to write the query so that it does not hash T2? Or is there a way to cache the T2 hash so that it is not rebuilt every time? The tables will only be updated every few days.
If that matters, T1 is a temporary table created earlier in the session.
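For reference, here is a rough sketch of what I could run to dig further. It rests on assumptions that are not confirmed above: that T2 has no index on ID2 yet, and that T1 has never been analyzed (autovacuum does not process temporary tables, so the planner may have no statistics for it). The index name t2_id2_idx is made up for illustration.

```sql
-- Temporary tables are not processed by autovacuum,
-- so collect planner statistics for T1 by hand.
ANALYZE T1;

-- Hypothetical index name; lets the planner consider
-- looking up T2 rows by ID2 instead of scanning all of T2.
CREATE INDEX t2_id2_idx ON T2 (ID2);
ANALYZE T2;

-- Same query with explicit JOIN syntax. EXPLAIN ANALYZE actually
-- runs the query and reports the real time spent in each plan node.
EXPLAIN ANALYZE
SELECT T1.ID1, T2.ID2, T2.info4, T2.info5, T2.info6
FROM T1
JOIN T2 ON T1.ID2 = T2.ID2;
```

Whether the planner then picks the index over the hash join probably depends on how many T2 rows match each T1.ID2, so the EXPLAIN ANALYZE output is what would show whether this helps.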
performance sql join postgresql
Claudiu