Is there a useful way to index a text column containing regex patterns?

I am using PostgreSQL, currently version 9.2, but I am open to upgrading.

In one of my tables, I have a column of type text that stores regular expression patterns.

    CREATE TABLE foo (
        id serial,
        pattern text,
        PRIMARY KEY(id)
    );
    CREATE INDEX foo_pattern_idx ON foo(pattern);

Then I query it like this:

    INSERT INTO foo (pattern) VALUES ('^abc.*$');

    SELECT * FROM foo WHERE 'abc literal string' ~ pattern;

I understand that this is a kind of reverse LIKE or reverse pattern match. If it were the other, more common way around, with my haystack in the database and my needle bound as a parameter, I could use a btree index more or less efficiently, depending on the exact search pattern and data.

But the data I have is a table of patterns and other data associated with those patterns. I need to ask the database which rows contain patterns that match my query text. Is there a way to make this more efficient than a sequential scan that checks every row in my table?

operators regex pattern-matching indexing postgresql
1 answer

There is no way.

Indexes require IMMUTABLE expressions. The result of your expression depends on the input string. I see no other way than evaluating the expression for each row, which means a sequential scan.

Related answer with more details on the IMMUTABLE angle:
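You can check this yourself by asking the planner for its plan. A minimal sketch, assuming the foo table and foo_pattern_idx index from the question exist; I would expect a sequential scan on foo with the regex as a filter, though the exact plan text varies by version and table size:

```sql
-- Assumes the table and index from the question have been created.
-- The constant is on the left of ~ and the column on the right,
-- so the btree index on foo(pattern) has nothing to search by.
EXPLAIN
SELECT *
FROM   foo
WHERE  'abc literal string' ~ pattern;
```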

  • Does PostgreSQL support "accent insensitive" collation?

The difference is that in your case there is no workaround: it cannot be indexed. The index must store constant values in its tuples, and those are simply not available, because the resulting value for each row is computed from the input. And you cannot transform the input without looking at the column value.

Postgres index usage is bound to operators, and only indexes on expressions to the left of the operator can be used (due to the same logical restrictions). More details:

  • Can PostgreSQL index array columns?

Many operators define a COMMUTATOR, which allows the query planner / optimizer to flip the expressions so that the indexed one ends up on the left. A simple example: the commutator of = is =. The commutator of > is < and vice versa. The documentation:

the index-scan machinery expects to see the indexed column on the left of the operator it is given.

The regex operator ~ has no commutator, again, because that is not possible. See for yourself:
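For comparison, here is a sketch of how to look up the commutators of the simple operators just mentioned, using the integer comparison operators as an arbitrary example. The cast to regoperator only makes the output readable. It should list = as its own commutator and < and > as each other's:

```sql
-- Look up the commutators of a few integer comparison operators.
-- oprcom holds the OID of the commutator; regoperator renders it
-- as an operator name with its argument types.
SELECT oprname, oprcom::regoperator AS commutator
FROM   pg_operator
WHERE  oprname IN ('=', '<', '>')
AND    oprleft  = 'integer'::regtype
AND    oprright = 'integer'::regtype;
```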

    SELECT oprname, oprright::regtype, oprleft::regtype, oprcom
    FROM   pg_operator
    WHERE  oprname = '~'
    AND    'text'::regtype IN (oprright, oprleft);

     oprname | oprright |  oprleft  | oprcom
    ---------+----------+-----------+--------
     ~       | text     | name      |      0
     ~       | text     | text      |      0
     ~       | text     | character |      0
     ~       | text     | citext    |      0

Refer to the manual here:

oprcom ... Commutator of this operator, if any ...
Unused columns contain zeroes. For example, oprleft is zero for a prefix operator.

I have tried this before and had to accept that it is impossible on principle.
