Join removal and attr_needed cleanup
The attached patch tries to fix a corner case where attr_needed of an inner
relation of an OJ contains the join relid only because another,
already-removed OJ, needed some of its attributes. The unnecessary presence of
the join relid in attr_needed can prevent the planner from further join
removals.
Do cases like this seem worth the effort and is the logic I use correct?
--
Antonin Houska
Web: https://www.cybertec-postgresql.com
Attachments:
0001-Cleanup-attr_needed-properly-after-join-removal.patchtext/x-diffDownload+130-2
We are encountering this issue which results in poor planning for some
views.
Some examples to illustrate the issue:
-- setup
create table foo as
select id1, id2
from generate_series(1, 100) id1,
generate_series(1, 100) id2;
alter table foo add unique (id1, id2);
-- join elimination works as expected
explain (costs off)
select a.*
from foo a
left join foo b on (b.id1, b.id2) = (a.id1, a.id2)
left join foo c on (c.id1, c.id2) = (a.id1, a.id2);
-- ^^^^^^^^^^^^^^
-- QUERY PLAN
-- -------------------
-- Seq Scan on foo a
-- join elimination works as expected
explain (costs off)
select a.*
from foo a
left join foo b on (b.id1, b.id2) = (a.id1, a.id2)
left join foo c on (c.id1, c.id2) = (b.id1, b.id2);
-- ^^^^^^^^^^^^^^
-- QUERY PLAN
-- -------------------
-- Seq Scan on foo a
-- join elimination fails
-- expect both b and c to be eliminated, but b remains
explain (costs off)
select a.*
from foo a
left join foo b on (b.id1, b.id2) = (a.id1, a.id2)
left join foo c on (c.id1, c.id2) = (a.id1, b.id2);
-- ^^^^^^^^^^^^^^
-- QUERY PLAN
-- ----------------------------------------------------
-- Hash Left Join
-- Hash Cond: ((a.id1 = b.id1) AND (a.id2 = b.id2))
-- -> Seq Scan on foo a
-- -> Hash
-- -> Seq Scan on foo b
Bennie Swart <bennieswart@gmail.com> writes:
-- join elimination fails
-- expect both b and c to be eliminated, but b remains
explain (costs off)
select a.*
from foo a
left join foo b on (b.id1, b.id2) = (a.id1, a.id2)
left join foo c on (c.id1, c.id2) = (a.id1, b.id2);
-- ^^^^^^^^^^^^^^
This does work in HEAD, presumably as a consequence of a3179ab69:
regression=# explain (costs off)
select a.*
from foo a
left join foo b on (b.id1, b.id2) = (a.id1, a.id2)
left join foo c on (c.id1, c.id2) = (a.id1, b.id2);
QUERY PLAN
-------------------
Seq Scan on foo a
(1 row)
I think it's still too soon to consider back-patching that though,
since it's only been in the tree for six weeks.
regards, tom lane
Given that some time has passed I'd like to raise this again. Any chance
of this making it back to 16 to fix the regression?
We'll be using 16 for the better part of a year still and we're having
to resort to some annoying workarounds for this.
Regards,
Bennie
Show quoted text
On 2024/11/10 18:17, Tom Lane wrote:
Bennie Swart <bennieswart@gmail.com> writes:
-- join elimination fails
-- expect both b and c to be eliminated, but b remains
explain (costs off)
select a.*
from foo a
left join foo b on (b.id1, b.id2) = (a.id1, a.id2)
left join foo c on (c.id1, c.id2) = (a.id1, b.id2);
-- ^^^^^^^^^^^^^^This does work in HEAD, presumably as a consequence of a3179ab69:
regression=# explain (costs off)
select a.*
from foo a
left join foo b on (b.id1, b.id2) = (a.id1, a.id2)
left join foo c on (c.id1, c.id2) = (a.id1, b.id2);
QUERY PLAN
-------------------
Seq Scan on foo a
(1 row)I think it's still too soon to consider back-patching that though,
since it's only been in the tree for six weeks.regards, tom lane