Thread: Suboptimal query plans for BETWEEN SYMMETRIC operations
PostgreSQL version: 17.0
OS: macOS 14.7.1
Description:
A condition: "col BETWEEN SYMMETRIC val1 AND val2" is currently rewritten to "(((col >= val1) AND (col <= val2)) OR ((col >= val2) AND (col <= val1)))" that would lead to suboptimal plans using an extra Bitmap Index Scan or Index Scan/Index Only Scan with the entire predicate placed in the "Filter" instead of "Index Cond".
Rewriting it to "col BETWEEN LEAST(val1, val2) AND GREATEST(val2, val1)" first helps produce simpler and more efficient plans.
Example:
postgres=# select version();
version
------------------------------------------------------------------------------------------------------------------
PostgreSQL 17.0 on x86_64-apple-darwin23.6.0, compiled by Apple clang version 16.0.0 (clang-1600.0.26.4), 64-bit
(1 row)
postgres=# create table t (c int);
postgres=# create index i_t_c on t (c);
*** two Bitmap Index Scans + BitmapOr ***
postgres=# explain select * from t where c between symmetric 10 and 1;
QUERY PLAN
---------------------------------------------------------------------------
Bitmap Heap Scan on t (cost=8.58..19.10 rows=25 width=4)
Recheck Cond: (((c >= 10) AND (c <= 1)) OR ((c >= 1) AND (c <= 10)))
-> BitmapOr (cost=8.58..8.58 rows=26 width=0)
-> Bitmap Index Scan on i_t_c (cost=0.00..4.29 rows=13 width=0)
Index Cond: ((c >= 10) AND (c <= 1))
-> Bitmap Index Scan on i_t_c (cost=0.00..4.29 rows=13 width=0)
Index Cond: ((c >= 1) AND (c <= 10))
(7 rows)
postgres=# set enable_bitmapscan=off;
postgres=# explain select * from t where c between symmetric 10 and 1;
QUERY PLAN
------------------------------------------------------------------
Seq Scan on t (cost=0.00..61.00 rows=25 width=4)
Filter: (((c >= 10) AND (c <= 1)) OR ((c >= 1) AND (c <= 10)))
(2 rows)
postgres=# set enable_seqscan=off;
*** Index Only Scan scanning the entire index ***
postgres=# explain select * from t where c between symmetric 10 and 1;
QUERY PLAN
--------------------------------------------------------------------
Index Only Scan using i_t_c on t (cost=0.12..4.15 rows=1 width=4)
Filter: (((c >= 10) AND (c <= 1)) OR ((c >= 1) AND (c <= 10)))
(2 rows)
*** more efficient plans with the alternative form ***
postgres=# set enable_bitmapscan=on;
postgres=# explain select * from t where c between least(10, 1) and greatest(10, 1);
QUERY PLAN
---------------------------------------------------------------------
Bitmap Heap Scan on t (cost=4.29..15.02 rows=13 width=4)
Recheck Cond: ((c >= 1) AND (c <= 10))
-> Bitmap Index Scan on i_t_c (cost=0.00..4.29 rows=13 width=0)
Index Cond: ((c >= 1) AND (c <= 10))
(4 rows)
postgres=# set enable_bitmapscan=off;
postgres=# explain select * from t where c between least(10, 1) and greatest(10, 1);
QUERY PLAN
----------------------------------------------------------------------
Index Only Scan using i_t_c on t (cost=0.15..36.42 rows=13 width=4)
Index Cond: ((c >= 1) AND (c <= 10))
(2 rows)
OS: macOS 14.7.1
Description:
A condition: "col BETWEEN SYMMETRIC val1 AND val2" is currently rewritten to "(((col >= val1) AND (col <= val2)) OR ((col >= val2) AND (col <= val1)))" that would lead to suboptimal plans using an extra Bitmap Index Scan or Index Scan/Index Only Scan with the entire predicate placed in the "Filter" instead of "Index Cond".
Rewriting it to "col BETWEEN LEAST(val1, val2) AND GREATEST(val2, val1)" first helps produce simpler and more efficient plans.
Example:
postgres=# select version();
version
------------------------------------------------------------------------------------------------------------------
PostgreSQL 17.0 on x86_64-apple-darwin23.6.0, compiled by Apple clang version 16.0.0 (clang-1600.0.26.4), 64-bit
(1 row)
postgres=# create table t (c int);
postgres=# create index i_t_c on t (c);
*** two Bitmap Index Scans + BitmapOr ***
postgres=# explain select * from t where c between symmetric 10 and 1;
QUERY PLAN
---------------------------------------------------------------------------
Bitmap Heap Scan on t (cost=8.58..19.10 rows=25 width=4)
Recheck Cond: (((c >= 10) AND (c <= 1)) OR ((c >= 1) AND (c <= 10)))
-> BitmapOr (cost=8.58..8.58 rows=26 width=0)
-> Bitmap Index Scan on i_t_c (cost=0.00..4.29 rows=13 width=0)
Index Cond: ((c >= 10) AND (c <= 1))
-> Bitmap Index Scan on i_t_c (cost=0.00..4.29 rows=13 width=0)
Index Cond: ((c >= 1) AND (c <= 10))
(7 rows)
postgres=# set enable_bitmapscan=off;
postgres=# explain select * from t where c between symmetric 10 and 1;
QUERY PLAN
------------------------------------------------------------------
Seq Scan on t (cost=0.00..61.00 rows=25 width=4)
Filter: (((c >= 10) AND (c <= 1)) OR ((c >= 1) AND (c <= 10)))
(2 rows)
postgres=# set enable_seqscan=off;
*** Index Only Scan scanning the entire index ***
postgres=# explain select * from t where c between symmetric 10 and 1;
QUERY PLAN
--------------------------------------------------------------------
Index Only Scan using i_t_c on t (cost=0.12..4.15 rows=1 width=4)
Filter: (((c >= 10) AND (c <= 1)) OR ((c >= 1) AND (c <= 10)))
(2 rows)
*** more efficient plans with the alternative form ***
postgres=# set enable_bitmapscan=on;
postgres=# explain select * from t where c between least(10, 1) and greatest(10, 1);
QUERY PLAN
---------------------------------------------------------------------
Bitmap Heap Scan on t (cost=4.29..15.02 rows=13 width=4)
Recheck Cond: ((c >= 1) AND (c <= 10))
-> Bitmap Index Scan on i_t_c (cost=0.00..4.29 rows=13 width=0)
Index Cond: ((c >= 1) AND (c <= 10))
(4 rows)
postgres=# set enable_bitmapscan=off;
postgres=# explain select * from t where c between least(10, 1) and greatest(10, 1);
QUERY PLAN
----------------------------------------------------------------------
Index Only Scan using i_t_c on t (cost=0.15..36.42 rows=13 width=4)
Index Cond: ((c >= 1) AND (c <= 10))
(2 rows)
On Fri, 8 Nov 2024 at 08:36, Mineharu Takahara <mtakahara@yugabyte.com> wrote: > A condition: "col BETWEEN SYMMETRIC val1 AND val2" is currently rewritten to "(((col >= val1) AND (col <= val2)) OR ((col>= val2) AND (col <= val1)))" that would lead to suboptimal plans using an extra Bitmap Index Scan or Index Scan/IndexOnly Scan with the entire predicate placed in the "Filter" instead of "Index Cond". This isn't a bug, it's just something that could perhaps be made more optimal. If you're interested in making improvements in this area for core PostgreSQL, then pgsql-hackers is the place to discuss that. David
David Rowley <dgrowleyml@gmail.com> writes: > On Fri, 8 Nov 2024 at 08:36, Mineharu Takahara <mtakahara@yugabyte.com> wrote: >> A condition: "col BETWEEN SYMMETRIC val1 AND val2" is currently rewritten to "(((col >= val1) AND (col <= val2)) OR ((col>= val2) AND (col <= val1)))" that would lead to suboptimal plans using an extra Bitmap Index Scan or Index Scan/IndexOnly Scan with the entire predicate placed in the "Filter" instead of "Index Cond". > This isn't a bug, it's just something that could perhaps be made more optimal. Indeed. The trouble with the LEAST/GREATEST formulation is that it may result in different semantics in situations where val1 and val2 aren't the same type. Also, LEAST/GREATEST rely on the default btree opclass for the common type, which might not match the semantics of the comparison operators that the current coding chooses. There are ways around that --- one could be to transform to LEAST/GREATEST only when the arguments do resolve as the same type. And perhaps you could convince people that BETWEEN ought to depend on the default btree opclass not on operator names. But it's all a lot messier than you might think. > If you're interested in making improvements in this area for core > PostgreSQL, then pgsql-hackers is the place to discuss that. Yup. regards, tom lane
> > This isn't a bug, it's just something that could perhaps be made more optimal.
>
> Indeed.
> Indeed.
I agree. I should've clarified upfront that this was rather an enhancement request. Or is there some specific way of tagging enhancement requests / improvement suggestions that I can follow?
> The trouble with the LEAST/GREATEST formulation is that it may result
in different semantics in situations where val1 and val2 aren't thesame type. Also, LEAST/GREATEST rely on the default btree opclass
for the common type, which might not match the semantics of the
comparison operators that the current coding chooses.
Great points. Thanks for the insights!
> If you're interested in making improvements in this area for core
> PostgreSQL, then pgsql-hackers is the place to discuss that.
> PostgreSQL, then pgsql-hackers is the place to discuss that.
Got it, thanks!
On Thu, Nov 7, 2024 at 6:22 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
David Rowley <dgrowleyml@gmail.com> writes:
> On Fri, 8 Nov 2024 at 08:36, Mineharu Takahara <mtakahara@yugabyte.com> wrote:
>> A condition: "col BETWEEN SYMMETRIC val1 AND val2" is currently rewritten to "(((col >= val1) AND (col <= val2)) OR ((col >= val2) AND (col <= val1)))" that would lead to suboptimal plans using an extra Bitmap Index Scan or Index Scan/Index Only Scan with the entire predicate placed in the "Filter" instead of "Index Cond".
> This isn't a bug, it's just something that could perhaps be made more optimal.
Indeed.
The trouble with the LEAST/GREATEST formulation is that it may result
in different semantics in situations where val1 and val2 aren't the
same type. Also, LEAST/GREATEST rely on the default btree opclass
for the common type, which might not match the semantics of the
comparison operators that the current coding chooses.
There are ways around that --- one could be to transform to
LEAST/GREATEST only when the arguments do resolve as the same type.
And perhaps you could convince people that BETWEEN ought to depend
on the default btree opclass not on operator names. But it's all
a lot messier than you might think.
> If you're interested in making improvements in this area for core
> PostgreSQL, then pgsql-hackers is the place to discuss that.
Yup.
regards, tom lane