samekeys

Started by Bruce Momjian about 27 years ago, 3 messages, hackers
#1 Bruce Momjian
bruce@momjian.us

I have modified samekeys() again. It is now:

    for (key1 = keys1, key2 = keys2;
         key1 != NIL && key2 != NIL;
         key1 = lnext(key1), key2 = lnext(key2))
    {
        for (key1a = lfirst(key1), key2a = lfirst(key2);
             key1a != NIL && key2a != NIL;
             key1a = lnext(key1a), key2a = lnext(key2a))
            if (!equal(lfirst(key1a), lfirst(key2a)))
                return false;
        if (key1a != NIL)
            return false;
    }

This basically says that key1, which is the old key, has to match key2
for the length of key1. If key2 has extra keys after that, that is
fine. We will still consider the keys equal. The old code obviously
was broken and badly thought out. One side benefit of this is that
unordered joins that are the same cost as ordered joins are discarded, in
the hope that the ordered joins can be used later on.
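
The rule described above amounts to a prefix test: the old key list must
match the new one element by element for the whole length of the old
list, and anything extra at the end of the new list is ignored. A minimal
sketch of that idea follows. It assumes a simplified model in which a key
list is a flat array of integers rather than the nested lists that
samekeys() walks; the name keys_match_prefix and the integer encoding are
hypothetical and are not taken from the PostgreSQL source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the comparison described above: a key list
 * is modelled as a flat array of ints (an assumption for illustration).
 * Returns true when old_keys matches new_keys for the full length of
 * old_keys; extra trailing entries in new_keys are allowed.
 */
bool
keys_match_prefix(const int *old_keys, size_t n_old,
                  const int *new_keys, size_t n_new)
{
    size_t      i;

    /* A NULL array is only acceptable when its length is zero. */
    assert(old_keys != NULL || n_old == 0);
    assert(new_keys != NULL || n_new == 0);

    /* The old list cannot be a prefix of a shorter list. */
    if (n_old > n_new)
        return false;

    for (i = 0; i < n_old; i++)
        if (old_keys[i] != new_keys[i])
            return false;

    return true;
}
```

In this model the test is deliberately asymmetric: (1,2) matches (1,2,3),
but (1,2,3) does not match (1,2).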

OPTIMIZER_DEBUG now shows:

(9 8 ): size=1 width=8
path list:
MergeJoin size=1 cost=0.000000
clauses=(x7.y = x8.y)
sortouter=1 sortinner=1
SeqScan(8) size=0 cost=0.000000
SeqScan(9) size=0 cost=0.000000
MergeJoin size=1 cost=0.000000
clauses=(x7.y = x8.y)
sortouter=1 sortinner=1
SeqScan(9) size=0 cost=0.000000
SeqScan(8) size=0 cost=0.000000
cheapest path:
MergeJoin size=1 cost=0.000000
clauses=(x7.y = x8.y)
sortouter=1 sortinner=1
SeqScan(8) size=0 cost=0.000000
SeqScan(9) size=0 cost=0.000000

This is correct. The old code kept many more plans for this simple join,
perhaps 12, so the number of plans the optimizer carried grew quickly as
more tables were joined.

We now have an OPTDUP_DEBUG option to show key and PathOrder duplication
testing.

I am unsure if samekeys should just test the first key for equality, or
the full length of key1 as I have done. Does the optimizer make use of
the second key of a merged RelOptInfo for subsequent joins? My guess is
that it does. For example, I keep plans that have key orders of 1,2,3
and 1,3,2. If only the first key is used in subsequent joins, I could
just keep the cheapest of the two.
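
To make the trade-off in this question concrete, here is a small sketch
that compares two orderings only down to a chosen depth. It is an
illustration under assumptions, not optimizer code: key orders are
modelled as flat integer arrays, and same_order_to_depth is a
hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: report whether two key orders agree on their
 * first `depth` keys. Key orders are modelled as flat int arrays,
 * which is an assumption made only for this illustration.
 */
bool
same_order_to_depth(const int *a, size_t na,
                    const int *b, size_t nb,
                    size_t depth)
{
    size_t      i;

    assert(a != NULL || na == 0);
    assert(b != NULL || nb == 0);

    /* Both orderings must be at least `depth` keys long. */
    if (na < depth || nb < depth)
        return false;

    for (i = 0; i < depth; i++)
        if (a[i] != b[i])
            return false;

    return true;
}
```

With depth 1, the orders 1,2,3 and 1,3,2 compare as the same, so only the
cheaper plan would need to be kept. With depth 2 or 3 they differ, so both
plans would be retained.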

-- 
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026
#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#1)
Re: [HACKERS] samekeys

Bruce Momjian <maillist@candle.pha.pa.us> writes:

> This basically says that key1, which is the old key, has to match key2
> for the length of key1. If key2 has extra keys after that, that is
> fine. We will still consider the keys equal. The old code obviously
> was broken and badly thought out.
> ...
> I am unsure if samekeys should just test the first key for equality, or
> the full length of key1 as I have done.

The comment in front of samekeys claimed:

* It isn't necessary to check that each sublist exactly contain
* the same elements because if the routine that built these
* sublists together is correct, having one element in common
* implies having all elements in common.

Was that wrong? Or, perhaps, it was once right but no longer?
It sounded like fragile coding to me, but I didn't have reason
to know it was broken...

regards, tom lane

#3 Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#2)
Re: [HACKERS] samekeys

> Bruce Momjian <maillist@candle.pha.pa.us> writes:
>
> > This basically says that key1, which is the old key, has to match key2
> > for the length of key1. If key2 has extra keys after that, that is
> > fine. We will still consider the keys equal. The old code obviously
> > was broken and badly thought out.
> > ...
> > I am unsure if samekeys should just test the first key for equality, or
> > the full length of key1 as I have done.
>
> The comment in front of samekeys claimed:
>
> * It isn't necessary to check that each sublist exactly contain
> * the same elements because if the routine that built these
> * sublists together is correct, having one element in common
> * implies having all elements in common.
>
> Was that wrong? Or, perhaps, it was once right but no longer?
> It sounded like fragile coding to me, but I didn't have reason
> to know it was broken...

I think it was wrong. It clearly was not passing the right parameters.
As far as I know, (1,2,3) and (3,2,1) are not the same. Their test would
just take '1' and see if it is in (3,2,1).
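
The difference between the two kinds of test can be sketched as follows.
This is only an illustration of the behaviour described in this message,
not a copy of the old samekeys() code: keys are modelled as plain
integers, and key_is_member and same_ordering are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical membership test: is `key` present anywhere in keys[]?
 * This models "take '1' and see if it is in (3,2,1)".
 */
bool
key_is_member(int key, const int *keys, size_t n)
{
    size_t      i;

    assert(keys != NULL || n == 0);

    for (i = 0; i < n; i++)
        if (keys[i] == key)
            return true;

    return false;
}

/*
 * Hypothetical order-sensitive test: the two lists must have the same
 * length and the same key in every position.
 */
bool
same_ordering(const int *a, size_t na, const int *b, size_t nb)
{
    size_t      i;

    assert(a != NULL || na == 0);
    assert(b != NULL || nb == 0);

    if (na != nb)
        return false;

    for (i = 0; i < na; i++)
        if (a[i] != b[i])
            return false;

    return true;
}
```

Under these assumptions, the membership test accepts the pair (1,2,3)
and (3,2,1) because 1 appears in the second list, while the
order-sensitive test rejects it.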
