Bug in planner

Started by Teodor Sigaevalmost 11 years ago3 messageshackers
Jump to latest
#1Teodor Sigaev
teodor@sigaev.ru

Hi!

I faced with planner error:
ERROR: could not find RelOptInfo for given relids

A test case is in attachment, all supported versions of pgsql and HEAD are
affected.

Suppose, there is something wrong in pull_up_sublinks_qual_recurse() near
converting NOT EXISTS into join, because preventing such convertion makes query
worked. But I'm not sure.

--
Teodor Sigaev E-mail: teodor@sigaev.ru
WWW: http://www.sigaev.ru/

Attachments:

bug.sqltext/plain; charset=UTF-8; name=bug.sqlDownload
#2David Rowley
dgrowleyml@gmail.com
In reply to: Teodor Sigaev (#1)
Re: Bug in planner

On 24 April 2015 at 21:43, Teodor Sigaev <teodor@sigaev.ru> wrote:

Hi!

I faced with planner error:
ERROR: could not find RelOptInfo for given relids

Good find!

I've simplified your query a bit, the following still shows the issue
(using your schema):

SELECT *
FROM t1
WHERE NOT EXISTS (SELECT t2.c2 AS c1
FROM t2
LEFT OUTER JOIN t3 ON t2.c3 = t3.c1
LEFT OUTER JOIN (SELECT t5.c1 AS c1
FROM t4
LEFT OUTER JOIN t5 ON t4.c2 = t5.c1 --offset 0
) a1 ON a1.c1 = t3.c2
WHERE t1.c1 = t2.c2
);

I've done a little debugging on this too and I get the idea that in
eqjoinsel() that min_righthand incorrectly does not have a bit set for "t3"

When the failing call is made to find_join_rel() with the above query
relids being searched for has a decimal value of 388 (binary 110000100 i.e
t5, t4, t2)

find_join_rel makes a pass over join_rel_list to search for the 388 valued
relids.

join_rel_list contains the following:

1 -> 396 (110001100) t5, t4, t3, t2
2 -> 384 (110000000) t5, t4
3 -> 392 (110001000) t5, t4, t3
4 -> 396 (110001100) t5, t4, t3, t2

I looked up simple_rte_array to determine which bits are for which relation.

simple_rte_array:
1 -> t1
2 -> t2
3 -> t3
4 -> join
5 -> a1
6 -> join
7 -> t4
8 -> t5

I'd imagine that the find_join_input_rel() search should actually be for
relids 396. I need to spend more time in this area to get a better grasp of
what's actually meant to be happening, but I think the problem lies
in make_outerjoininfo() when it determines what min_righthand should be set
to with the following:

/*
* Similarly for required RHS. But here, we must also include any lower
* inner joins, to ensure we don't try to commute with any of them.
*/
min_righthand = bms_int_members(bms_union(clause_relids, inner_join_rels),
right_rels);

I think the problem seems to be down to the fact that inner_join_rels and
clause_relids are built from deconstruct_jointree() which I'd imagine does
not get modified when the subquery for t4 and t5 is pulled up, therefore is
out-of-date. </theory>

I've attached a patch which appears to fix the problem, but this is more
for the purposes of a demonstration of where the problem lies. I don't have
enough knowledge of what's meant to be happening here, I'll need to spend
more time reading code and debugging.

On a side note, I just discovered another join removal opportunity:

explain select * from t1 where not exists(select 1 from t2 left join t3 on
t2.c1 = t3.c1 where t1.c1=t2.c1);

The join to t3 here is useless, as since it's a left join, the join could
only cause duplication of t2 rows, and this does not matter as we're
performing an anti join anyway (same applies for semi join).

Regards

David Rowley

Attachments:

anti_join_with_pulledup_outer_joins_fix_attempt.patchapplication/octet-stream; name=anti_join_with_pulledup_outer_joins_fix_attempt.patchDownload+1-1
#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: David Rowley (#2)
Re: Bug in planner

David Rowley <dgrowleyml@gmail.com> writes:

On 24 April 2015 at 21:43, Teodor Sigaev <teodor@sigaev.ru> wrote:

I faced with planner error:
ERROR: could not find RelOptInfo for given relids

I've done a little debugging on this too and I get the idea that in
eqjoinsel() that min_righthand incorrectly does not have a bit set for "t3"

Yeah. The short of it seems to be that initsplan.c is too optimistic
about whether antijoins can be reordered against outer joins in their RHS.
The discussion in optimizer/README says pretty clearly that they can't
(and eqjoinsel is relying on that, per the comment therein), so I think
this is basically brain fade in translating that logic to code.

regards, tom lane

Attachments:

antijoin-fix.patchtext/x-diff; charset=us-ascii; name=antijoin-fix.patchDownload+66-6