Join removal and attr_needed cleanup

Started by Antonin Houskaalmost 2 years ago4 messageshackers
Jump to latest
#1Antonin Houska
ah@cybertec.at

The attached patch tries to fix a corner case where attr_needed of an inner
relation of an OJ contains the join relid only because another,
already-removed OJ, needed some of its attributes. The unnecessary presence of
the join relid in attr_needed can prevent the planner from further join
removals.

Do cases like this seem worth the effort and is the logic I use correct?

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

Attachments:

0001-Cleanup-attr_needed-properly-after-join-removal.patchtext/x-diffDownload+130-2
#2Bennie Swart
bennieswart@gmail.com
In reply to: Antonin Houska (#1)
Re: Join removal and attr_needed cleanup

We are encountering this issue which results in poor planning for some
views.

Some examples to illustrate the issue:

-- setup
create table foo as
  select id1, id2
    from generate_series(1, 100) id1,
         generate_series(1, 100) id2;
alter table foo add unique (id1, id2);

-- join elimination works as expected
explain (costs off)
  select a.*
    from foo a
      left join foo b on (b.id1, b.id2) = (a.id1, a.id2)
      left join foo c on (c.id1, c.id2) = (a.id1, a.id2);
                                       -- ^^^^^^^^^^^^^^
--     QUERY PLAN
-- -------------------
--  Seq Scan on foo a

-- join elimination works as expected
explain (costs off)
  select a.*
    from foo a
      left join foo b on (b.id1, b.id2) = (a.id1, a.id2)
      left join foo c on (c.id1, c.id2) = (b.id1, b.id2);
                                       -- ^^^^^^^^^^^^^^
--     QUERY PLAN
-- -------------------
--  Seq Scan on foo a

-- join elimination fails
-- expect both b and c to be eliminated, but b remains
explain (costs off)
  select a.*
    from foo a
      left join foo b on (b.id1, b.id2) = (a.id1, a.id2)
      left join foo c on (c.id1, c.id2) = (a.id1, b.id2);
                                       -- ^^^^^^^^^^^^^^
--                      QUERY PLAN
-- ----------------------------------------------------
--  Hash Left Join
--    Hash Cond: ((a.id1 = b.id1) AND (a.id2 = b.id2))
--    ->  Seq Scan on foo a
--    ->  Hash
--          ->  Seq Scan on foo b

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bennie Swart (#2)
Re: Join removal and attr_needed cleanup

Bennie Swart <bennieswart@gmail.com> writes:

-- join elimination fails
-- expect both b and c to be eliminated, but b remains
explain (costs off)
  select a.*
    from foo a
      left join foo b on (b.id1, b.id2) = (a.id1, a.id2)
      left join foo c on (c.id1, c.id2) = (a.id1, b.id2);
                                       -- ^^^^^^^^^^^^^^

This does work in HEAD, presumably as a consequence of a3179ab69:

regression=# explain (costs off)
select a.*
from foo a
left join foo b on (b.id1, b.id2) = (a.id1, a.id2)
left join foo c on (c.id1, c.id2) = (a.id1, b.id2);
QUERY PLAN
-------------------
Seq Scan on foo a
(1 row)

I think it's still too soon to consider back-patching that though,
since it's only been in the tree for six weeks.

regards, tom lane

#4Bennie Swart
bennieswart@gmail.com
In reply to: Tom Lane (#3)
Re: Join removal and attr_needed cleanup

Given that some time has passed I'd like to raise this again. Any chance
of this making it back to 16 to fix the regression?

We'll be using 16 for the better part of a year still and we're having
to resort to some annoying workarounds for this.

Regards,

Bennie

Show quoted text

On 2024/11/10 18:17, Tom Lane wrote:

Bennie Swart <bennieswart@gmail.com> writes:

-- join elimination fails
-- expect both b and c to be eliminated, but b remains
explain (costs off)
  select a.*
    from foo a
      left join foo b on (b.id1, b.id2) = (a.id1, a.id2)
      left join foo c on (c.id1, c.id2) = (a.id1, b.id2);
                                       -- ^^^^^^^^^^^^^^

This does work in HEAD, presumably as a consequence of a3179ab69:

regression=# explain (costs off)
select a.*
from foo a
left join foo b on (b.id1, b.id2) = (a.id1, a.id2)
left join foo c on (c.id1, c.id2) = (a.id1, b.id2);
QUERY PLAN
-------------------
Seq Scan on foo a
(1 row)

I think it's still too soon to consider back-patching that though,
since it's only been in the tree for six weeks.

regards, tom lane