Thread: Performance improvement for queries with IN clause

Performance improvement for queries with IN clause

From
Rafia Sabih
Date:
Hello all,

I would like to direct your attention to the queries of following type,
select <some_column(s)>
from <table_name>
where <some_column> IN (<a_list_of_some_values>)

the plan for such a query uses index scan (or index-only), now in our experiments, if the provided list is sorted then query performance improves by ~10%. Which makes sense also as once we have found the required btree leaf we just keep moving in one direction, which should be expectantly less time consuming than searching the tree again.

Now, my question is shouldn't we always use this list in sorted order, in other words can there be scenarios where such a sorting will not help? I am talking about only the cases where the list consists of all constants and could fit in memory. Basically, when we are transforming the in expression and found that it consists of all constants, then sort it as well, codewise at transfromAExprIn, of course there might be better ways to accomplish this.

So, your thoughts, opinions, suggestions are more than welcome.

--
Regards,
Rafia Sabih

Re: Performance improvement for queries with IN clause

From
Andreas Karlsson
Date:
On 11/8/19 2:52 PM, Rafia Sabih wrote:
> Now, my question is shouldn't we always use this list in sorted order, 
> in other words can there be scenarios where such a sorting will not 
> help? I am talking about only the cases where the list consists of all 
> constants and could fit in memory. Basically, when we are 
> transforming the in expression and found that it consists of all 
> constants, then sort it as well, codewise at transfromAExprIn, of course 
> there might be better ways to accomplish this.
> 
> So, your thoughts, opinions, suggestions are more than welcome.

If it is worth sorting them should depend on the index, e.g. for hash 
indexes sorting would just be a waste of time.

Andreas



Re: Performance improvement for queries with IN clause

From
Amit Kapila
Date:
On Sat, Nov 9, 2019 at 5:22 PM Andreas Karlsson <andreas@proxel.se> wrote:
>
> On 11/8/19 2:52 PM, Rafia Sabih wrote:
> > Now, my question is shouldn't we always use this list in sorted order,
> > in other words can there be scenarios where such a sorting will not
> > help? I am talking about only the cases where the list consists of all
> > constants and could fit in memory. Basically, when we are
> > transforming the in expression and found that it consists of all
> > constants, then sort it as well, codewise at transfromAExprIn, of course
> > there might be better ways to accomplish this.
> >
> > So, your thoughts, opinions, suggestions are more than welcome.
>
> If it is worth sorting them should depend on the index, e.g. for hash
> indexes sorting would just be a waste of time.
>

I think we also need to be careful that this might lead to regression
for cases where the list is already sorted.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Performance improvement for queries with IN clause

From
Rafia Sabih
Date:
On Sat, 9 Nov 2019 at 12:52, Andreas Karlsson <andreas@proxel.se> wrote:
On 11/8/19 2:52 PM, Rafia Sabih wrote:
> Now, my question is shouldn't we always use this list in sorted order,
> in other words can there be scenarios where such a sorting will not
> help? I am talking about only the cases where the list consists of all
> constants and could fit in memory. Basically, when we are
> transforming the in expression and found that it consists of all
> constants, then sort it as well, codewise at transfromAExprIn, of course
> there might be better ways to accomplish this.
>
> So, your thoughts, opinions, suggestions are more than welcome.

If it is worth sorting them should depend on the index, e.g. for hash
indexes sorting would just be a waste of time.

Hi Andreas, 
Thanks for your response. Here, I meant this list sorting only for Btree, as you well pointed out that for other indexes like hash this wouldn't really make sense. 


--
Regards,
Rafia Sabih