pgsql: Further fixes for degenerate outer join clauses.

Started by Tom Laneover 10 years ago9 messages

tgl@sss.pgh.pa.us

over 10 years ago

Further fixes for degenerate outer join clauses.

Further testing revealed that commit f69b4b9495269cc4 was still a few
bricks shy of a load: minor tweaking of the previous test cases resulted
in the same wrong-outer-join-order problem coming back. After study
I concluded that my previous changes in make_outerjoininfo() were just
accidentally masking the problem, and should be reverted in favor of
forcing syntactic join order whenever an upper outer join's predicate
doesn't mention a lower outer join's LHS. This still allows the
chained-outer-joins style that is the normally optimizable case.

I also tightened things up some more in join_is_legal(). It seems to me
on review that what's really happening in the exception case where we
ignore a mismatched special join is that we're allowing the proposed join
to associate into the RHS of the outer join we're comparing it to. As
such, we should *always* insist that the proposed join be a left join,
which eliminates a bunch of rather dubious argumentation. The case where
we weren't enforcing that was the one that was already known buggy anyway
(it had a violatable Assert before the aforesaid commit) so it hardly
deserves a lot of deference.

Back-patch to all active branches, like the previous patch. The added
regression test case failed in all branches back to 9.1, and I think it's
only an unrelated change in costing calculations that kept 9.0 from
choosing a broken plan.

Branch
------
REL9_1_STABLE

Details
-------
http://git.postgresql.org/pg/commitdiff/656b1e8cf358990b7700448d3b9e85202105cde0

Modified Files
--------------
src/backend/optimizer/README | 23 +++++++---
src/backend/optimizer/path/joinrels.c | 73 ++++++++++++--------------------
src/backend/optimizer/plan/initsplan.c | 28 ++++++------
src/test/regress/expected/join.out | 72 ++++++++++++++++++++++++++++++-
src/test/regress/sql/join.sql | 25 +++++++++++
5 files changed, 153 insertions(+), 68 deletions(-)

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

Andrew Dunstan

andrew@dunslane.net

over 10 years ago

In reply to: Tom Lane (#1)

Re: pgsql: Further fixes for degenerate outer join clauses.

On 08/06/2015 03:36 PM, Tom Lane wrote:

Further fixes for degenerate outer join clauses.

Further testing revealed that commit f69b4b9495269cc4 was still a few
bricks shy of a load: minor tweaking of the previous test cases resulted
in the same wrong-outer-join-order problem coming back. After study
I concluded that my previous changes in make_outerjoininfo() were just
accidentally masking the problem, and should be reverted in favor of
forcing syntactic join order whenever an upper outer join's predicate
doesn't mention a lower outer join's LHS. This still allows the
chained-outer-joins style that is the normally optimizable case.

I also tightened things up some more in join_is_legal(). It seems to me
on review that what's really happening in the exception case where we
ignore a mismatched special join is that we're allowing the proposed join
to associate into the RHS of the outer join we're comparing it to. As
such, we should *always* insist that the proposed join be a left join,
which eliminates a bunch of rather dubious argumentation. The case where
we weren't enforcing that was the one that was already known buggy anyway
(it had a violatable Assert before the aforesaid commit) so it hardly
deserves a lot of deference.

Back-patch to all active branches, like the previous patch. The added
regression test case failed in all branches back to 9.1, and I think it's
only an unrelated change in costing calculations that kept 9.0 from
choosing a broken plan.

Looks like this might have upset brolga on 9.0 and 9.1 - it's coming up
with a different plan from what's expected.

cheers

andrew

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

Tom Lane

tgl@sss.pgh.pa.us

over 10 years ago

In reply to: Andrew Dunstan (#2)

Re: pgsql: Further fixes for degenerate outer join clauses.

Andrew Dunstan <andrew@dunslane.net> writes:

On 08/06/2015 03:36 PM, Tom Lane wrote:

Further fixes for degenerate outer join clauses.

Looks like this might have upset brolga on 9.0 and 9.1 - it's coming up
with a different plan from what's expected.

Yeah, I saw that, but haven't had time yet to investigate what's up.
Odd that it's just the one critter though ...

regards, tom lane

--
Sent via pgsql-committers mailing list (pgsql-committers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-committers

Tom Lane

tgl@sss.pgh.pa.us

over 10 years ago

In reply to: Andrew Dunstan (#2)

Re: [COMMITTERS] pgsql: Further fixes for degenerate outer join clauses.

Andrew Dunstan <andrew@dunslane.net> writes:

On 08/06/2015 03:36 PM, Tom Lane wrote:

Further fixes for degenerate outer join clauses.

Looks like this might have upset brolga on 9.0 and 9.1 - it's coming up
with a different plan from what's expected.

I looked into this, and while I can't be certain of the diagnosis without
looking at the assembly code brolga generates (no, I don't want to), I'm
pretty sure I know what's going on here. The two plans in question have
exactly the same estimated costs on my machine, which results in add_path
discarding the second one to arrive. Evidently, on brolga the second path
gets some fractionally cheaper estimate due to some compiler-specific
variation in the way the arithmetic is done, so it gets kept rather than
the first one.

In 9.2 and up we got rid of that class of problems with this patch:

Author: Tom Lane <tgl@sss.pgh.pa.us>
Branch: master Release: REL9_2_BR [33e99153e] 2012-04-21 00:51:14 -0400

Use fuzzy not exact cost comparison for the final tie-breaker in add_path.

Instead of an exact cost comparison, use a fuzzy comparison with 1e-10
delta after all other path metrics have proved equal. This is to avoid
having platform-specific roundoff behaviors determine the choice when
two paths are really the same to our cost estimators. Adjust the
recently-added test case that made it obvious we had a problem here.

but 9.0 and 9.1 still have platform-dependent treatment of paths with
essentially equal cost estimates.

I'm not entirely sure what to do about this. We could back-patch that
patch into 9.0 and 9.1, but it's conceivable somebody would squawk about
planner behavioral changes. The only other idea that seems practical is
to remove regression test cases that have platform-specific results in
those branches. Probably that wouldn't result in a real reduction in the
quality of the test coverage for those branches (we could still execute
the query, just not EXPLAIN it). But it seems like a pretty ad-hoc
answer, and the next case might be one that hurts more not to test.

Thoughts?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Stephen Frost

sfrost@snowman.net

over 10 years ago

In reply to: Tom Lane (#4)

Re: [COMMITTERS] pgsql: Further fixes for degenerate outer join clauses.

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

I'm not entirely sure what to do about this. We could back-patch that
patch into 9.0 and 9.1, but it's conceivable somebody would squawk about
planner behavioral changes. The only other idea that seems practical is
to remove regression test cases that have platform-specific results in
those branches. Probably that wouldn't result in a real reduction in the
quality of the test coverage for those branches (we could still execute
the query, just not EXPLAIN it). But it seems like a pretty ad-hoc
answer, and the next case might be one that hurts more not to test.

Thoughts?

Have an alternate file for those other cases, rather than remove the
test? The complaint was about one buildfarm member, so hopefully that's
practical and doesn't require a lot of different permutations.

Thanks!

Stephen

Tom Lane

tgl@sss.pgh.pa.us

over 10 years ago

In reply to: Stephen Frost (#5)

Re: [COMMITTERS] pgsql: Further fixes for degenerate outer join clauses.

Stephen Frost <sfrost@snowman.net> writes:

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

I'm not entirely sure what to do about this. We could back-patch that
patch into 9.0 and 9.1, but it's conceivable somebody would squawk about
planner behavioral changes. The only other idea that seems practical is
to remove regression test cases that have platform-specific results in
those branches. Probably that wouldn't result in a real reduction in the
quality of the test coverage for those branches (we could still execute
the query, just not EXPLAIN it). But it seems like a pretty ad-hoc
answer, and the next case might be one that hurts more not to test.

Thoughts?

Have an alternate file for those other cases, rather than remove the
test? The complaint was about one buildfarm member, so hopefully that's
practical and doesn't require a lot of different permutations.

I considered that but don't find it practical or attractive, especially
not if the only way to keep such a file updated is to wait and see whether
the buildfarm complains.

On the whole I'm leaning towards back-patching 33e99153e. While the case
of exactly equal plan costs does come up in the regression tests (which
tend to inspect plans for queries on small simple tables), I think it's
relatively unlikely to happen with real-world data.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Stephen Frost

sfrost@snowman.net

over 10 years ago

In reply to: Tom Lane (#6)

Re: [COMMITTERS] pgsql: Further fixes for degenerate outer join clauses.

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

Stephen Frost <sfrost@snowman.net> writes:

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

I'm not entirely sure what to do about this. We could back-patch that
patch into 9.0 and 9.1, but it's conceivable somebody would squawk about
planner behavioral changes. The only other idea that seems practical is
to remove regression test cases that have platform-specific results in
those branches. Probably that wouldn't result in a real reduction in the
quality of the test coverage for those branches (we could still execute
the query, just not EXPLAIN it). But it seems like a pretty ad-hoc
answer, and the next case might be one that hurts more not to test.

Thoughts?

Have an alternate file for those other cases, rather than remove the
test? The complaint was about one buildfarm member, so hopefully that's
practical and doesn't require a lot of different permutations.

I considered that but don't find it practical or attractive, especially
not if the only way to keep such a file updated is to wait and see whether
the buildfarm complains.

I agree, that's a bit unfortunate, but it strikes me as pretty unlikely
that we're ever going to change those tests or that a code change would
end up causing yet another different plan before 9.1 is completely out
of support in the next couple years.

On the whole I'm leaning towards back-patching 33e99153e. While the case
of exactly equal plan costs does come up in the regression tests (which
tend to inspect plans for queries on small simple tables), I think it's
relatively unlikely to happen with real-world data.

I agree it's unlikely, but I don't particularly like changing our mind
on a back-patching decision 3 years later to satisfy our regression
tests.

Still, I don't feel particularly strongly about either side of this, so
I'm happy with you making the decision.

Thanks!

Stephen

Tom Lane

tgl@sss.pgh.pa.us

over 10 years ago

In reply to: Stephen Frost (#7)

Re: [COMMITTERS] pgsql: Further fixes for degenerate outer join clauses.

Stephen Frost <sfrost@snowman.net> writes:

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

Stephen Frost <sfrost@snowman.net> writes:

Have an alternate file for those other cases, rather than remove the
test? The complaint was about one buildfarm member, so hopefully that's
practical and doesn't require a lot of different permutations.

I considered that but don't find it practical or attractive, especially
not if the only way to keep such a file updated is to wait and see whether
the buildfarm complains.

I agree, that's a bit unfortunate, but it strikes me as pretty unlikely
that we're ever going to change those tests or that a code change would
end up causing yet another different plan before 9.1 is completely out
of support in the next couple years.

Meh. 9.0 will be dead in a month or two, but 9.1 still has a ways to go,
and I don't really want to be dealing with this issue every time Andreas'
fuzz testing finds another corner case :-(. It was hard enough devising
regression tests that used stable table data for those cases as it was.

Frequently, when building a planner regression test, it's important that
a particular plan shape be selected (eg because what you're trying to test
is that setrefs.c postprocessing does the right thing with a given case).
So if we take supporting brolga's behavior as-is as a requirement, it's
not always going to be good enough to just maintain an alternate output
file; we'd have to try to adjust the test query to make it do the right
thing. It's bad enough dealing with 32/64bit and endianness dependencies
for that, I don't want minute compiler codegen choices entering into it
as well.

On the whole I'm leaning towards back-patching 33e99153e. While the case
of exactly equal plan costs does come up in the regression tests (which
tend to inspect plans for queries on small simple tables), I think it's
relatively unlikely to happen with real-world data.

I agree it's unlikely, but I don't particularly like changing our mind
on a back-patching decision 3 years later to satisfy our regression
tests.

Actually, I'd take that the other way around: now that the patch has been
out for awhile, and we've not heard squawks that could be traced to it,
there's a much better argument that back-patching would be safe than there
was at the time.

There is certainly some risk in making changes that are only for ease of
maintenance and don't fix a provable bug ... but it's not like the other
changes I've committed into 9.0 and 9.1 lately don't change any
corner-case planner behaviors. Besides, you could argue that it is a bug
if rebuilding with a different compiler can change the planner's behavior.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Stephen Frost

sfrost@snowman.net

over 10 years ago

In reply to: Tom Lane (#8)

Re: [COMMITTERS] pgsql: Further fixes for degenerate outer join clauses.

* Tom Lane (tgl@sss.pgh.pa.us) wrote:

Stephen Frost <sfrost@snowman.net> writes:

I agree, that's a bit unfortunate, but it strikes me as pretty unlikely
that we're ever going to change those tests or that a code change would
end up causing yet another different plan before 9.1 is completely out
of support in the next couple years.

Meh. 9.0 will be dead in a month or two, but 9.1 still has a ways to go,
and I don't really want to be dealing with this issue every time Andreas'
fuzz testing finds another corner case :-(. It was hard enough devising
regression tests that used stable table data for those cases as it was.

Frequently, when building a planner regression test, it's important that
a particular plan shape be selected (eg because what you're trying to test
is that setrefs.c postprocessing does the right thing with a given case).
So if we take supporting brolga's behavior as-is as a requirement, it's
not always going to be good enough to just maintain an alternate output
file; we'd have to try to adjust the test query to make it do the right
thing. It's bad enough dealing with 32/64bit and endianness dependencies
for that, I don't want minute compiler codegen choices entering into it
as well.

Certainly a good point.

On the whole I'm leaning towards back-patching 33e99153e. While the case
of exactly equal plan costs does come up in the regression tests (which
tend to inspect plans for queries on small simple tables), I think it's
relatively unlikely to happen with real-world data.

I agree it's unlikely, but I don't particularly like changing our mind
on a back-patching decision 3 years later to satisfy our regression
tests.

Actually, I'd take that the other way around: now that the patch has been
out for awhile, and we've not heard squawks that could be traced to it,
there's a much better argument that back-patching would be safe than there
was at the time.

I definitely agree with that when it comes to bug cases, but I'm a bit
more wary around planner changes. Still, I don't recall any specific
complaints about plan changes from pre-9.2 to 9.2+ in such corner cases
and I do agree with you that real data would make this far less likely
to happen.

There is certainly some risk in making changes that are only for ease of
maintenance and don't fix a provable bug ... but it's not like the other
changes I've committed into 9.0 and 9.1 lately don't change any
corner-case planner behaviors. Besides, you could argue that it is a bug
if rebuilding with a different compiler can change the planner's behavior.

Good point.

Thanks!

Stephen