Performance improvement for queries with IN clause - Mailing list pgsql-hackers

From Rafia Sabih
Subject Performance improvement for queries with IN clause
Date
Msg-id CA+FpmFeUTPQ2RnBO-e2OHxxi4imjH++gjq2tXN1KQA0uw-5jUA@mail.gmail.com
Whole thread Raw
Responses Re: Performance improvement for queries with IN clause  (Andreas Karlsson <andreas@proxel.se>)
List pgsql-hackers
Hello all,

I would like to direct your attention to the queries of following type,
select <some_column(s)>
from <table_name>
where <some_column> IN (<a_list_of_some_values>)

the plan for such a query uses index scan (or index-only), now in our experiments, if the provided list is sorted then query performance improves by ~10%. Which makes sense also as once we have found the required btree leaf we just keep moving in one direction, which should be expectantly less time consuming than searching the tree again.

Now, my question is shouldn't we always use this list in sorted order, in other words can there be scenarios where such a sorting will not help? I am talking about only the cases where the list consists of all constants and could fit in memory. Basically, when we are transforming the in expression and found that it consists of all constants, then sort it as well, codewise at transfromAExprIn, of course there might be better ways to accomplish this.

So, your thoughts, opinions, suggestions are more than welcome.

--
Regards,
Rafia Sabih

pgsql-hackers by date:

Previous
From: Julien Rouhaud
Date:
Subject: Re: Monitoring disk space from within the server
Next
From: Julien Rouhaud
Date:
Subject: Re: Monitoring disk space from within the server