What @Satya claims in his comment is not entirely true. When a matching index exists, the planner chooses a full table scan only if the table statistics imply that the query will return more than roughly 5% of the table (the exact threshold depends), because it is then faster to scan the entire table.
As you can see from your own question, this is not the case for your query. It uses a Bitmap Index Scan followed by a Bitmap Heap Scan, although I would have expected a plain index scan. (?)
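To see what the planner believes about the selectivity of the filter, you can look at the collected statistics. This is only a sketch: it assumes the table lives in the `public` schema and that the filtered column is `car_type_id`, as in the statements further down.

```sql
-- Statistics the planner uses to estimate how many rows a value matches.
-- Assumption: schema "public", filter column "car_type_id".
SELECT n_distinct,
       most_common_vals,
       most_common_freqs
FROM   pg_stats
WHERE  schemaname = 'public'
AND    tablename  = 'cars'
AND    attname    = 'car_type_id';
```

If the value you search for appears in `most_common_vals` with a high frequency, a sequential scan becomes more likely.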
I noticed two more things in your EXPLAIN output:
The first scan finds 832 rows, while the second reduces the count to 739. This would indicate that you have many dead tuples in your index.
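A quick way to cross-check the dead-tuple suspicion is the statistics view `pg_stat_user_tables`. Note that these counters are table-level estimates kept by the statistics collector, not exact figures for the index, so treat the result as a hint only.

```sql
-- Estimated live vs. dead tuples and the last (auto)vacuum times for "cars".
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_vacuum,
       last_autovacuum
FROM   pg_stat_user_tables
WHERE  relname = 'cars';
```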
Check the execution time after each step with EXPLAIN ANALYZE, and maybe add the results to your question:
First, rerun the query with EXPLAIN ANALYZE two or three times to populate the cache. What is the result of the last run compared to the first?
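For example, the measurement could look like the following. The query text is an assumption reconstructed from the normalized query shown later (the original query is not repeated here); substitute your own. The `BUFFERS` option is optional, but it shows how many pages came from cache versus disk, which makes the effect of warming the cache visible.

```sql
-- Assumed form of the original query: varchar column car_type_id on cars.
EXPLAIN (ANALYZE, BUFFERS)
SELECT count(*)
FROM   cars
WHERE  car_type_id = 'toyota_hilux';
```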
Next:
VACUUM ANALYZE cars;
Rerun.
If you have many write operations on the table, I would set a fill factor below 100. For example:
ALTER TABLE cars SET (fillfactor=90);
Go lower if your row size is large or you have a lot of write operations. Then:
VACUUM FULL ANALYZE cars;
This will take a while. Rerun.
Or, if you can afford it (and other important queries do not have conflicting requirements):
CLUSTER cars USING index_cars_on_reference_id;
This rewrites the table in the physical order of the index, which should make this kind of query much faster.
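To judge whether clustering took effect, one option is to refresh the statistics and look at the `correlation` column of `pg_stats`, which describes how closely the physical row order follows each column's sort order. A rough sketch, listing all columns of `cars` because the column behind the index is not named here:

```sql
-- CLUSTER does not update planner statistics by itself.
ANALYZE cars;

-- Values near 1 or -1 mean the rows are stored almost in column order.
SELECT attname, correlation
FROM   pg_stats
WHERE  tablename = 'cars'
ORDER  BY attname;
```

Keep in mind that CLUSTER takes an exclusive lock on the table while it runs, and that the ordering is not maintained for rows written afterwards.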
Normalize the schema
If you need this to be really fast, create a car_type table with a serial primary key and reference it from the cars table. This will shrink the required index to a fraction of what it is now.
It should go without saying that you make a backup before trying any of this.
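-- A regular (not temporary) table, so that cars can reference it permanently.
CREATE TABLE car_type (
   car_type_id serial PRIMARY KEY
 , car_type    text
);
-- The existing varchar column cars.car_type_id holds the type names.
INSERT INTO car_type (car_type)
SELECT DISTINCT car_type_id FROM cars ORDER BY car_type_id;
ANALYZE car_type;
CREATE UNIQUE INDEX car_type_uni_idx ON car_type (car_type);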
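The statements above only build the lookup table. The index name `cars_car_type_id_idx` and the query that follow presuppose that `cars` itself has been switched over to the integer key. A sketch of how those remaining steps might look, assuming `car_type` is a regular (non-temporary) table and that the old varchar column in `cars` is named `car_type_id`; the constraint name `cars_car_type_id_fkey` is a made-up example:

```sql
-- Keep the old varchar values under a new name, then add the integer key.
ALTER TABLE cars RENAME COLUMN car_type_id TO car_type;
ALTER TABLE cars ADD COLUMN car_type_id int;

-- Fill the new key by matching on the type name.
UPDATE cars c
SET    car_type_id = ct.car_type_id
FROM   car_type ct
WHERE  ct.car_type = c.car_type;

-- Drop the old varchar column (its indexes go with it).
ALTER TABLE cars DROP COLUMN car_type;

CREATE INDEX cars_car_type_id_idx ON cars (car_type_id);

-- Hypothetical constraint name.
ALTER TABLE cars
  ADD CONSTRAINT cars_car_type_id_fkey FOREIGN KEY (car_type_id)
  REFERENCES car_type (car_type_id) ON UPDATE CASCADE;

-- Reclaim the space left behind by the rewrite and refresh statistics.
VACUUM FULL ANALYZE cars;
```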
Or, if you want to go all out:
CLUSTER cars USING cars_car_type_id_idx;
Your query would now look like this:
SELECT count(*) FROM cars WHERE car_type_id = (SELECT car_type_id FROM car_type WHERE car_type = 'toyota_hilux');
And it should be even faster, mostly because the index and the table are now smaller, but also because integer handling is faster than varchar handling. However, the gain will not be dramatic compared to the table clustered on the varchar column.
A welcome side effect: if you need to rename a type, it is now a tiny UPDATE on a single row that does not touch the big table at all.
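As an illustration of such a rename, assuming the normalized schema is in place (the new name is a made-up example):

```sql
-- Touches exactly one row in the lookup table; cars is not modified.
UPDATE car_type
SET    car_type = 'toyota_hilux_2'   -- hypothetical new name
WHERE  car_type = 'toyota_hilux';
```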