PostgreSQL does actually support GIN indexes on array columns. Unfortunately, that does not appear to be usable for NOT ARRAY[...] <@ indexed_col, and GIN indexes are unsuitable for frequently updated tables anyway.
Demo:
    CREATE TABLE arrtable (id integer primary key, array_column integer[]);

    INSERT INTO arrtable (id, array_column) VALUES (1, ARRAY[1,2,3,4]);

    CREATE INDEX arrtable_arraycolumn_gin_arr_idx
    ON arrtable USING GIN(array_column);
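To check, EXPLAIN the negated query against the demo table (a sketch added here; the exact plan text varies by version and data):

    -- Negated containment: the form the original query needs.
    EXPLAIN SELECT count(id) FROM arrtable
    WHERE NOT (ARRAY[1] <@ arrtable.array_column);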
Unfortunately, this shows that, as written, we cannot use the index. If you do not negate the condition, the index can be used, so you can search for and count the rows that do contain the search element (by removing NOT).
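For instance, the non-negated form can be answered from the GIN index, since <@ containment is supported by the array operator class (again a sketch; with the tiny one-row demo table the planner may still prefer a seqscan, so the enable_seqscan toggle is only there to force the point):

    -- Discourage seqscans so the tiny demo table still shows the index plan.
    SET enable_seqscan = off;

    -- Without NOT, the containment test can use the GIN index
    -- (typically via a bitmap index scan on larger tables).
    EXPLAIN SELECT count(id) FROM arrtable
    WHERE ARRAY[1] <@ arrtable.array_column;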
You could use the index to count the records that do contain the target value and then subtract that result from the count of all records. Since counting all the rows in a table is rather slow in PostgreSQL (9.1 and older) and requires a sequential scan, this will actually be slower than your current query. It is possible that on 9.2 an index-only scan can be used for the row count if you have a b-tree index on id, in which case this might actually be OK:
    SELECT (
        SELECT count(id) FROM arrtable
    ) - (
        SELECT count(id) FROM arrtable
        WHERE (ARRAY[1] <@ arrtable.array_column)
    );
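If you want to verify whether your 9.2 install actually plans an index-only scan for the plain count, one way to check (a sketch; VACUUM is needed so the visibility map is populated, and the plan depends on your data and settings) is:

    -- Make sure the visibility map is up to date, then inspect the plan
    -- for the "count everything" half of the query.
    VACUUM arrtable;
    EXPLAIN SELECT count(id) FROM arrtable;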
It is guaranteed to perform worse than your original version on Pg 9.1 and below, because in addition to the seqscan your original requires, it also needs a GIN index scan. I have now tested this on 9.2, and it does appear to use the index for the count, so it is worth exploring on 9.2. With some less trivial dummy data:
    DROP INDEX arrtable_arraycolumn_gin_arr_idx;

    TRUNCATE TABLE arrtable;

    INSERT INTO arrtable (id, array_column)
    SELECT s, ARRAY[1,2,s,s*2,s*3,s/2,s/4]
    FROM generate_series(1,1000000) s;

    CREATE INDEX arrtable_arraycolumn_gin_arr_idx
    ON arrtable USING GIN(array_column);
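To repeat the comparison on this dummy data yourself (a sketch; timings will of course vary with hardware and settings), you can time both forms in psql:

    \timing on

    -- Original form: rows whose array does NOT contain the value.
    SELECT count(id) FROM arrtable
    WHERE NOT (ARRAY[1] <@ arrtable.array_column);

    -- Index-assisted form: total rows minus rows that do contain the value.
    SELECT (SELECT count(id) FROM arrtable)
         - (SELECT count(id) FROM arrtable
            WHERE ARRAY[1] <@ arrtable.array_column);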
Note that a GIN index like this will slow down updates a LOT, and it is rather slow to create in the first place. It is not suitable for tables that get updated as much as yours appears to.
Worse still, a query using this index takes up to twice as long as your original query, and at best about half as long, on the same dataset. The worst case is where the index is not very selective, like ARRAY[1]: 4s versus 2s for the original query. Where the index is highly selective (that is, there are not many matches, for example ARRAY[199]), it runs in about 1.2 seconds versus the original's 3s. This index is simply not worth having for this query.
The lesson here? Sometimes the right answer is just to do a sequential scan.
Since that will not do for your hit rates, either maintain a materialized view with a trigger, as @debenhur suggested, or try to invert the array into a list of parameters that the record does not have, so that you can use a GiST index, as @maniek suggested.
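For illustration, here is a minimal sketch of the trigger-maintained approach (the arrtable_element_counts table, function, and trigger names are all hypothetical; it assumes every element of interest gets a row in the counts table up front and ignores concurrency details):

    -- Hypothetical summary table: how many rows contain each element.
    CREATE TABLE arrtable_element_counts (
        element integer PRIMARY KEY,
        n       bigint  NOT NULL DEFAULT 0
    );

    -- Seed it from existing data, counting each row at most once per element
    -- even if the row's array repeats that element.
    INSERT INTO arrtable_element_counts (element, n)
    SELECT e, count(DISTINCT id)
    FROM (SELECT id, unnest(array_column) AS e FROM arrtable) s
    GROUP BY e;

    CREATE OR REPLACE FUNCTION arrtable_maintain_counts() RETURNS trigger AS $$
    BEGIN
        IF TG_OP IN ('UPDATE', 'DELETE') THEN
            -- The old row no longer contributes to these elements.
            UPDATE arrtable_element_counts
               SET n = n - 1
             WHERE element = ANY (OLD.array_column);
        END IF;
        IF TG_OP IN ('INSERT', 'UPDATE') THEN
            -- The new row now contributes to these elements
            -- (assumes each element already has a row in the counts table).
            UPDATE arrtable_element_counts
               SET n = n + 1
             WHERE element = ANY (NEW.array_column);
        END IF;
        RETURN NULL;
    END;
    $$ LANGUAGE plpgsql;

    CREATE TRIGGER arrtable_counts_trg
    AFTER INSERT OR UPDATE OR DELETE ON arrtable
    FOR EACH ROW EXECUTE PROCEDURE arrtable_maintain_counts();

With that in place, "rows not containing X" is just the total row count minus the stored n for X; if counting all rows is itself too slow, the same trigger can maintain a one-row total counter as well.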