Postgresql does not use the trigram index in the text column, but uses it in the varchar column

Question

Postgresql does not use the trigram index in the text column, but uses it in the varchar column

So basically I set up a very simple test pattern to test trigram capabilities and full-text indexing in postgresql 9.1 (stable version for Debian).

The following are table and index definitions:

-- Table: fulltextproba
-- DROP TABLE fulltextproba;
CREATE TABLE fulltextproba
(
  id integer NOT NULL,
  text text,
  varchar600 character varying(600) COLLATE pg_catalog."C.UTF-8",
  CONSTRAINT id PRIMARY KEY (id )
)
WITH (
  OIDS=FALSE
);

-- Index: id_index
-- DROP INDEX id_index;
CREATE UNIQUE INDEX id_index
  ON fulltextproba
  USING btree
  (id );

-- Index: text_gin_fulltext_hun
-- DROP INDEX text_gin_fulltext_hun;
CREATE INDEX text_gin_fulltext_hun
  ON fulltextproba
  USING gin
  (to_tsvector('hungarian'::text, text) );

-- Index: text_gin_trgm
-- DROP INDEX text_gin_trgm;
CREATE INDEX text_gin_trgm
  ON fulltextproba
  USING gin
  (text COLLATE pg_catalog."C.UTF-8" gin_trgm_ops);

-- Index: varchar600
-- DROP INDEX varchar600;
CREATE INDEX varchar600
  ON fulltextproba
  USING btree
  (varchar600 COLLATE pg_catalog."C.UTF-8" varchar_pattern_ops);

-- Index: varchar600_gin_trgm
-- DROP INDEX varchar600_gin_trgm;
CREATE INDEX varchar600_gin_trgm
  ON fulltextproba
  USING gin
  (varchar600 COLLATE pg_catalog."C.UTF-8" gin_trgm_ops);

My problem: if I perform a search %foo%that should use the trigram index, if I search in a text column, it is not:

SELECT COUNT(id) FROM public.fulltextproba WHERE text LIKE '%almáv%'
 count 
-------
   396
(1 row)

real    0m7.215s
user    0m0.020s
sys 0m0.004s
                                QUERY PLAN                                 
---------------------------------------------------------------------------
 Aggregate  (cost=657056.11..657056.12 rows=1 width=4)
   ->  Seq Scan on fulltextproba  (cost=0.00..657052.72 rows=1355 width=4)
         Filter: (text ~~ '%almáv%'::text)
(3 rows)

But if I enter the varchar600 column, it uses the trigram index and is - not surprisingly - much faster:

SELECT COUNT(id) FROM public.fulltextproba WHERE varchar600 LIKE '%almáv%'
 count 
-------
   373
(1 row)

real    0m0.184s
user    0m0.052s
sys 0m0.004s
                                         QUERY PLAN                                         
--------------------------------------------------------------------------------------------
 Aggregate  (cost=5283.11..5283.12 rows=1 width=4)
   ->  Bitmap Heap Scan on fulltextproba  (cost=62.50..5279.73 rows=1355 width=4)
         Recheck Cond: ((varchar600)::text ~~ '%almáv%'::text)
         ->  Bitmap Index Scan on varchar600_gin_trgm  (cost=0.00..62.16 rows=1355 width=0)
               Index Cond: ((varchar600)::text ~~ '%almáv%'::text)
(5 rows)

So, the final questions:

Why postgres does not use the trigram index in a text column.
How can postgres be used to use an index? Should I define it in any other way?

+4

indexing postgresql trigram

P.Péter 19 . '14 16:08

1

Erwin Brandstetter · Accepted Answer · 2014-08-19T17:27:24+0000

text . , EXPLAIN:

Index Cond: ((varchar600)::text ~~ '%almáv%'::text)

, , . :

text text,   -- default collation is ???
varchar600 character varying(600) COLLATE pg_catalog."C.UTF-8"

COLLATE pg_catalog."C.UTF-8". ? :

SHOW LC_COLLATE;

, . :

SELECT COUNT(id) FROM public.fulltextproba
WHERE text COLLATE pg_catalog."C.UTF-8" LIKE '%almáv%'

Postgres.

, , . .

Postgres , ( !):

SET enable_seqscan = off;

. :

Left Join

Outlook GIN Postgres 9.4

Postgres 9.4 GIN. , .

Postgresql does not use the trigram index in the text column, but uses it in the varchar column

Outlook GIN Postgres 9.4

More articles: