MergeJoin fails on incomplete btree opfamily definition

Started by Andrei Lepikhov15 days ago1 messagesbugs
Jump to latest
#1Andrei Lepikhov
lepihov@gmail.com

Hi,

While experimenting with chaotic cost assignment [1]https://github.com/danolivo/pg-chaos-test, I found that
equivclass.sql contains a query that sometimes fails with an error:

-- let's try that as a mergejoin
set enable_mergejoin = on;
set enable_nestloop = off;
explain (costs off)
select * from ec1,
(select ff + 1 as x from
...

The underlying problem is simple: an incomplete btree operator family.
That would be acceptable if the behaviour were consistent, but as the
attached self-contained SQL script shows, it is unstable and depends on
which sort direction the planner happens to choose for the merge join's
internal sort — the same catalogue state can yield either a valid plan
or an elog(ERROR) depending on cost-driven decisions.

The main reason to fix this: when the opfamily lacks the ordering
operator required to sort one side of a merge, the planner should skip
the MergeJoin strategy and fall back to another join method, rather than
constructing an unusable plan and crashing in
prepare_sort_from_pathkeys(). I have verified the issue reproduces on
current master. This code is unchanged for several releases, so it seems
worth fixing and back-patching.

The issue actually has two layers. The "missing operator" error is the
first one. There is a related "missing support function" error on the
same path — I'll discover it later if there is interest in this topic.
See the proposed fix for the current problem.

[1]: https://github.com/danolivo/pg-chaos-test

--
regards, Andrei Lepikhov,
pgEdge

Attachments:

repro_layer_1.sqltext/plain; charset=UTF-8; name=repro_layer_1.sqlDownload
v0-0001-Skip-merge-join-paths-when-opfamily-lacks-orderin.patchtext/plain; charset=UTF-8; name=v0-0001-Skip-merge-join-paths-when-opfamily-lacks-orderin.patchDownload+154-1