Avoiding planning redundant backwards merges

Started by Tom Laneover 18 years ago3 messageshackers
Jump to latest
#1Tom Lane
tgl@sss.pgh.pa.us

While fooling around with the planner performance bug reported here:
http://archives.postgresql.org/pgsql-bugs/2007-10/msg00173.php
I noticed that even after fixing the bug, 8.3 seemed to plan this
many-way join about 50% slower than 8.2 :-(. After some digging
I understand the reason: 8.3 will almost always consider both a
forward and a backward indexscan on each potentially useful index.
This results in twice as many pre-sorted paths available to
match_unsorted_outer(), and therefore about twice as much work
generating paths that are exactly equivalent cost-wise but generate
opposite output orders.

These extra paths are not without use, now that we have the ability to
declare DESC index columns: the only way to produce an indexed mergejoin
between an ASC and a DESC index is to scan one of them backwards. So I
don't want to lobotomize the code entirely. But we're paying a pretty
high price for that capability.

The idea I'm toying with is to make pathkeys_useful_for_merging()
consider only ASC pathkeys as useful for merging --- that is, only
pathkeys with pk_strategy = BTLessStrategyNumber. This would mean that
only forward scans on ASC indexes and backward scans on DESC indexes
would be considered to have "interesting" sort orders, and therefore
in cases without any ORDER BY clause to worry about, the other indexscan
path would not survive the initial competition in add_path. It'd be
seen as having the same cost and worse ordering, and would be dropped.

Now the tricky point in this is that if there's an ORDER BY on the
query, then you might want a "backwards" mergejoin after all, if
that means you come out with the right ordering for the ORDER BY.
However, if there is such an ORDER BY, then pathkeys_useful_for_ordering
will pick up on it and the opposite-direction indexscan will survive
after all. (The number of pathkeys we keep for a path is the larger
of these two functions' estimates, and paths with unequal pathkeys
do not compete in add_path.) We'll plan out all the alternatives
from both starting points, and eventually figure out at the top level
that one of them avoids a final sort and is therefore cheaper. So
in this not-too-common case, we'll be slower than 8.2 but also produce
a better plan; I don't feel too bad about that.

I admit this seems a bit Rube Goldbergian, but then again there is a
whole lot of the planner's behavior that is emergent from the interplay
of little pieces rather than being explicitly coded in one place.

Comments, better ideas?

regards, tom lane

#2Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#1)
Re: Avoiding planning redundant backwards merges

"Tom Lane" <tgl@sss.pgh.pa.us> writes:

The idea I'm toying with is to make pathkeys_useful_for_merging()
consider only ASC pathkeys as useful for merging --- that is, only
pathkeys with pk_strategy = BTLessStrategyNumber. This would mean that
only forward scans on ASC indexes and backward scans on DESC indexes
would be considered to have "interesting" sort orders, and therefore
in cases without any ORDER BY clause to worry about, the other indexscan
path would not survive the initial competition in add_path. It'd be
seen as having the same cost and worse ordering, and would be dropped.

So the case that wouldn't be covered would be if you have a descending index
on one table and an ascending index on another table and try to merge join
them?

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#2)
Re: Avoiding planning redundant backwards merges

Gregory Stark <stark@enterprisedb.com> writes:

"Tom Lane" <tgl@sss.pgh.pa.us> writes:

The idea I'm toying with is to make pathkeys_useful_for_merging()
consider only ASC pathkeys as useful for merging --- that is, only
pathkeys with pk_strategy = BTLessStrategyNumber.

So the case that wouldn't be covered would be if you have a descending index
on one table and an ascending index on another table and try to merge join
them?

No, that still works. The point is that an ascending pathkey would come
from a forward scan on the regular index and a backward scan on the
descending index. Only those two paths, not their siblings
backward-scan-on-regular and forward-scan-on-descending, would survive
the initial planning round, and so we'd only consider merging in that
direction and not the other. However, in the event that the query asks
for ORDER BY merge-key DESC, we reverse all that so that only the other
two paths survive.

regards, tom lane