Suboptimal query plans for BETWEEN SYMMETRIC operations

Started by Mineharu Takaharaover 1 year ago4 messagesbugs
Jump to latest
#1Mineharu Takahara
mtakahara@yugabyte.com

PostgreSQL version: 17.0
OS: macOS 14.7.1

Description:

A condition: "col BETWEEN SYMMETRIC val1 AND val2" is currently rewritten
to "(((col >= val1) AND (col <= val2)) OR ((col >= val2) AND (col <=
val1)))" that would lead to suboptimal plans using an extra Bitmap Index
Scan or Index Scan/Index Only Scan with the entire predicate placed in the
"Filter" instead of "Index Cond".

Rewriting it to "col BETWEEN LEAST(val1, val2) AND GREATEST(val2, val1)"
first helps produce simpler and more efficient plans.

Example:

postgres=# select version();
version

------------------------------------------------------------------------------------------------------------------
PostgreSQL 17.0 on x86_64-apple-darwin23.6.0, compiled by Apple clang
version 16.0.0 (clang-1600.0.26.4), 64-bit
(1 row)

postgres=# create table t (c int);
postgres=# create index i_t_c on t (c);

*** two Bitmap Index Scans + BitmapOr ***
postgres=# explain select * from t where c between symmetric 10 and 1;

QUERY PLAN
---------------------------------------------------------------------------
Bitmap Heap Scan on t (cost=8.58..19.10 rows=25 width=4)
Recheck Cond: (((c >= 10) AND (c <= 1)) OR ((c >= 1) AND (c <= 10)))
-> BitmapOr (cost=8.58..8.58 rows=26 width=0)
-> Bitmap Index Scan on i_t_c (cost=0.00..4.29 rows=13 width=0)
Index Cond: ((c >= 10) AND (c <= 1))
-> Bitmap Index Scan on i_t_c (cost=0.00..4.29 rows=13 width=0)
Index Cond: ((c >= 1) AND (c <= 10))
(7 rows)

postgres=# set enable_bitmapscan=off;
postgres=# explain select * from t where c between symmetric 10 and 1;

QUERY PLAN
------------------------------------------------------------------
Seq Scan on t (cost=0.00..61.00 rows=25 width=4)
Filter: (((c >= 10) AND (c <= 1)) OR ((c >= 1) AND (c <= 10)))
(2 rows)

postgres=# set enable_seqscan=off;

*** Index Only Scan scanning the entire index ***
postgres=# explain select * from t where c between symmetric 10 and 1;

QUERY PLAN
--------------------------------------------------------------------
Index Only Scan using i_t_c on t (cost=0.12..4.15 rows=1 width=4)
Filter: (((c >= 10) AND (c <= 1)) OR ((c >= 1) AND (c <= 10)))
(2 rows)

*** more efficient plans with the alternative form ***

postgres=# set enable_bitmapscan=on;
postgres=# explain select * from t where c between least(10, 1) and
greatest(10, 1);

QUERY PLAN
---------------------------------------------------------------------
Bitmap Heap Scan on t (cost=4.29..15.02 rows=13 width=4)
Recheck Cond: ((c >= 1) AND (c <= 10))
-> Bitmap Index Scan on i_t_c (cost=0.00..4.29 rows=13 width=0)
Index Cond: ((c >= 1) AND (c <= 10))
(4 rows)

postgres=# set enable_bitmapscan=off;
postgres=# explain select * from t where c between least(10, 1) and
greatest(10, 1);

QUERY PLAN
----------------------------------------------------------------------
Index Only Scan using i_t_c on t (cost=0.15..36.42 rows=13 width=4)
Index Cond: ((c >= 1) AND (c <= 10))
(2 rows)

#2David Rowley
dgrowleyml@gmail.com
In reply to: Mineharu Takahara (#1)
Re: Suboptimal query plans for BETWEEN SYMMETRIC operations

On Fri, 8 Nov 2024 at 08:36, Mineharu Takahara <mtakahara@yugabyte.com> wrote:

A condition: "col BETWEEN SYMMETRIC val1 AND val2" is currently rewritten to "(((col >= val1) AND (col <= val2)) OR ((col >= val2) AND (col <= val1)))" that would lead to suboptimal plans using an extra Bitmap Index Scan or Index Scan/Index Only Scan with the entire predicate placed in the "Filter" instead of "Index Cond".

This isn't a bug, it's just something that could perhaps be made more optimal.

If you're interested in making improvements in this area for core
PostgreSQL, then pgsql-hackers is the place to discuss that.

David

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#2)
Re: Suboptimal query plans for BETWEEN SYMMETRIC operations

David Rowley <dgrowleyml@gmail.com> writes:

On Fri, 8 Nov 2024 at 08:36, Mineharu Takahara <mtakahara@yugabyte.com> wrote:

A condition: "col BETWEEN SYMMETRIC val1 AND val2" is currently rewritten to "(((col >= val1) AND (col <= val2)) OR ((col >= val2) AND (col <= val1)))" that would lead to suboptimal plans using an extra Bitmap Index Scan or Index Scan/Index Only Scan with the entire predicate placed in the "Filter" instead of "Index Cond".

This isn't a bug, it's just something that could perhaps be made more optimal.

Indeed.

The trouble with the LEAST/GREATEST formulation is that it may result
in different semantics in situations where val1 and val2 aren't the
same type. Also, LEAST/GREATEST rely on the default btree opclass
for the common type, which might not match the semantics of the
comparison operators that the current coding chooses.

There are ways around that --- one could be to transform to
LEAST/GREATEST only when the arguments do resolve as the same type.
And perhaps you could convince people that BETWEEN ought to depend
on the default btree opclass not on operator names. But it's all
a lot messier than you might think.

If you're interested in making improvements in this area for core
PostgreSQL, then pgsql-hackers is the place to discuss that.

Yup.

regards, tom lane

#4Mineharu Takahara
mtakahara@yugabyte.com
In reply to: Tom Lane (#3)
Re: Suboptimal query plans for BETWEEN SYMMETRIC operations

This isn't a bug, it's just something that could perhaps be made more

optimal.

Indeed.

I agree. I should've clarified upfront that this was rather an enhancement
request. Or is there some specific way of tagging enhancement requests /
improvement suggestions that I can follow?

The trouble with the LEAST/GREATEST formulation is that it may result

in different semantics in situations where val1 and val2 aren't the
same type. Also, LEAST/GREATEST rely on the default btree opclass
for the common type, which might not match the semantics of the
comparison operators that the current coding chooses.

Great points. Thanks for the insights!

If you're interested in making improvements in this area for core
PostgreSQL, then pgsql-hackers is the place to discuss that.

Got it, thanks!

On Thu, Nov 7, 2024 at 6:22 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Show quoted text

David Rowley <dgrowleyml@gmail.com> writes:

On Fri, 8 Nov 2024 at 08:36, Mineharu Takahara <mtakahara@yugabyte.com>

wrote:

A condition: "col BETWEEN SYMMETRIC val1 AND val2" is currently

rewritten to "(((col >= val1) AND (col <= val2)) OR ((col >= val2) AND (col
<= val1)))" that would lead to suboptimal plans using an extra Bitmap Index
Scan or Index Scan/Index Only Scan with the entire predicate placed in the
"Filter" instead of "Index Cond".

This isn't a bug, it's just something that could perhaps be made more

optimal.

Indeed.

The trouble with the LEAST/GREATEST formulation is that it may result
in different semantics in situations where val1 and val2 aren't the
same type. Also, LEAST/GREATEST rely on the default btree opclass
for the common type, which might not match the semantics of the
comparison operators that the current coding chooses.

There are ways around that --- one could be to transform to
LEAST/GREATEST only when the arguments do resolve as the same type.
And perhaps you could convince people that BETWEEN ought to depend
on the default btree opclass not on operator names. But it's all
a lot messier than you might think.

If you're interested in making improvements in this area for core
PostgreSQL, then pgsql-hackers is the place to discuss that.

Yup.

regards, tom lane