So basically I set up a very simple test pattern to test trigram capabilities and full-text indexing in postgresql 9.1 (stable version for Debian).
The following are table and index definitions:
CREATE TABLE fulltextproba
(
id integer NOT NULL,
text text,
varchar600 character varying(600) COLLATE pg_catalog."C.UTF-8",
CONSTRAINT id PRIMARY KEY (id )
)
WITH (
OIDS=FALSE
);
CREATE UNIQUE INDEX id_index
ON fulltextproba
USING btree
(id );
CREATE INDEX text_gin_fulltext_hun
ON fulltextproba
USING gin
(to_tsvector('hungarian'::text, text) );
CREATE INDEX text_gin_trgm
ON fulltextproba
USING gin
(text COLLATE pg_catalog."C.UTF-8" gin_trgm_ops);
CREATE INDEX varchar600
ON fulltextproba
USING btree
(varchar600 COLLATE pg_catalog."C.UTF-8" varchar_pattern_ops);
CREATE INDEX varchar600_gin_trgm
ON fulltextproba
USING gin
(varchar600 COLLATE pg_catalog."C.UTF-8" gin_trgm_ops);
My problem: if I perform a search %foo%that should use the trigram index, if I search in a text column, it is not:
SELECT COUNT(id) FROM public.fulltextproba WHERE text LIKE '%almáv%'
count
396
(1 row)
real 0m7.215s
user 0m0.020s
sys 0m0.004s
QUERY PLAN
Aggregate (cost=657056.11..657056.12 rows=1 width=4)
-> Seq Scan on fulltextproba (cost=0.00..657052.72 rows=1355 width=4)
Filter: (text ~~ '%almáv%'::text)
(3 rows)
But if I enter the varchar600 column, it uses the trigram index and is - not surprisingly - much faster:
SELECT COUNT(id) FROM public.fulltextproba WHERE varchar600 LIKE '%almáv%'
count
373
(1 row)
real 0m0.184s
user 0m0.052s
sys 0m0.004s
QUERY PLAN
Aggregate (cost=5283.11..5283.12 rows=1 width=4)
-> Bitmap Heap Scan on fulltextproba (cost=62.50..5279.73 rows=1355 width=4)
Recheck Cond: ((varchar600)::text ~~ '%almáv%'::text)
-> Bitmap Index Scan on varchar600_gin_trgm (cost=0.00..62.16 rows=1355 width=0)
Index Cond: ((varchar600)::text ~~ '%almáv%'::text)
(5 rows)
So, the final questions:
- Why postgres does not use the trigram index in a text column.
- How can postgres be used to use an index? Should I define it in any other way?