Thread: Gin indexes on intarray is fast when value in array does not exists, and slow, when value exists

I have table with 500 000 rows, I have int[] column "my_array" in this table, this array column contains minimum 1 and maximum 5 different values.

I have GIN index on my_array column:

 "CREATE INDEX idx ON table_name USING GIN(my_array gin__int_ops)"

Then I use this query: "SELECT * FROM  table_name WHERE my_array @> '{3}'::integer[]  ORDER BY id LIMIT 50"

Execution time of this query is approximately 500-1000 ms. Then if I drop gin index "idx", query works extremely fast, less than 20 ms.

But, if I search value, which does not exists at all, for example no one array not contains number "77" and I search:  "WHERE my_array @> '{77}'::integer[]" , then using gin index is much better and fast, (less than 20 ms), but without index, query takes 500-1000 ms.


So, what to do? For values which does not in any one rows, using index is much better, but for values,  which are at least in several rows, using  index, slows down performance.

Can somehow make, that searching was always fast (when value exist in array and when not)




On Tue, Nov 8, 2016 at 8:43 PM, otar shavadze <oshavadze@gmail.com> wrote:
I have table with 500 000 rows, I have int[] column "my_array" in this table, this array column contains minimum 1 and maximum 5 different values.

you didn't show us postgres version.
 

I have GIN index on my_array column:

 "CREATE INDEX idx ON table_name USING GIN(my_array gin__int_ops)"

Then I use this query: "SELECT * FROM  table_name WHERE my_array @> '{3}'::integer[]  ORDER BY id LIMIT 50"

Execution time of this query is approximately 500-1000 ms. Then if I drop gin index "idx", query works extremely fast, less than 20 ms.

explain analyze would help us to see the problem.

 

But, if I search value, which does not exists at all, for example no one array not contains number "77" and I search:  "WHERE my_array @> '{77}'::integer[]" , then using gin index is much better and fast, (less than 20 ms), but without index, query takes 500-1000 ms.


So, what to do? For values which does not in any one rows, using index is much better, but for values,  which are at least in several rows, using  index, slows down performance.

Can somehow make, that searching was always fast (when value exist in array and when not)



On Tue, Nov 8, 2016 at 9:43 AM, otar shavadze <oshavadze@gmail.com> wrote:
I have table with 500 000 rows, I have int[] column "my_array" in this table, this array column contains minimum 1 and maximum 5 different values.

I have GIN index on my_array column:

 "CREATE INDEX idx ON table_name USING GIN(my_array gin__int_ops)"

Then I use this query: "SELECT * FROM  table_name WHERE my_array @> '{3}'::integer[]  ORDER BY id LIMIT 50"

Execution time of this query is approximately 500-1000 ms. Then if I drop gin index "idx", query works extremely fast, less than 20 ms.

Please post the information requested here: https://wiki.postgresql.org/wiki/Slow_Query_Questions

Also, can you show, 

select * from pg_stats where tablename ='table_name' and attname='my_array' \x\g\x

Cheers,

Jeff