Re: Bad query plan when you add many OR conditions - Mailing list pgsql-performance

From Thomas Kellerer
Subject Re: Bad query plan when you add many OR conditions
Date
Msg-id 9748c027-a524-2718-28ec-3c6bc74feafb@gmx.net
Whole thread Raw
In response to Bad query plan when you add many OR conditions  (Marco Colli <collimarco91@gmail.com>)
List pgsql-performance
Marco Colli schrieb am 10.01.2020 um 02:11:
> I have a query on a large table that is very fast (0s):
> https://gist.github.com/collimarco/039412b4fe0dcf39955888f96eff29db#file-fast_query-txt
> 
> Basically the query matches the rows that have a tag1 OR tag2 OR tag3 OR tag4 OR tag5... 
> 
> However if you increase the number of OR at some point PostgreSQL makes the bad decision to change its query plan!
Andthe new plan makes the query terribly slow:
 
> https://gist.github.com/collimarco/039412b4fe0dcf39955888f96eff29db#file-slow_query-txt
> 
> Instead of this (which is fast):
>   Bitmap Index Scan on index_subscriptions_on_project_id_and_tags
> It starts using this (which is slow):
>   Parallel Index Scan using index_subscriptions_on_project_id_and_created_at
> The choice seems quite stupid since it doesn't have the tags on the new index... and indeed the query takes about 1
minuteinstead of a few milliseconds. Here's a list of the available indexes:
 
> https://gist.github.com/collimarco/039412b4fe0dcf39955888f96eff29db#file-_indexes-txt
> 
> How can I encourage PostgreSQL to use the Bitmap Index Scan even when there are many OR conditions? I have tried with
VACUUMANALYZE subscriptions but it doesn't help.
 
> 
> Note: the query is generated dynamically by customers of a SaaS, so I don't have full control on it

Can you replace the many ORs with a single "overlaps" comparison?

This 

    (tags @> ARRAY['crt:2018_04']::varchar[]) OR (tags @> ARRAY['crt:2018_05']::varchar[]) OR (tags @>
ARRAY['crt:2018_06']::varchar[])

is equivalent to 

    tags && array['crt:2018_04','crt:2018_05','crt:2018_06', ...]

The && operator can make use of a GIN index so maybe that uses the index_subscriptions_on_project_id_and_tags
regardlessof the number of elements.
 



 





pgsql-performance by date:

Previous
From: Shira Bezalel
Date:
Subject: Re: Seeking reason behind performance gain in 12 with HashAggregate
Next
From: Joao Junior
Date:
Subject: shared buffers and startup process