Converting SetOp to read its two inputs separately

Started by Tom Laneover 1 year ago17 messageshackers

tgl@sss.pgh.pa.us

over 1 year ago

Here are the beginnings of a patchset to do what was discussed at [1]/messages/by-id/2631313.1730733484@sss.pgh.pa.us,
namely change the SetOp node type to read its inputs as outerPlan and
innerPlan, rather than appending them together with a flag column to
show which rows came from where.

The previous thread wondered why manually DISTINCT'ing inputs with a
lot of duplicates would make an INTERSECT faster than not doing so.
I speculated that the overhead of attaching the flag column (which
typically requires an additional projection node) plus the overhead of
an Append node might have a lot to do with that, and it seems I was
right. With this patchset on yesterday's HEAD, I get results like
these for a simple test case:

regression=# create table t1 as
regression-# select (random()*100)::int as a from generate_series(1,1000000);
SELECT 1000000
regression=# vacuum analyze t1;
VACUUM
regression=# set max_parallel_workers_per_gather TO 0;
SET
regression=# explain analyze select a from t1 intersect select a from t1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
HashSetOp Intersect (cost=0.00..33960.00 rows=101 width=4) (actual time=246.352..246.355 rows=101 loops=1)
-> Seq Scan on t1 (cost=0.00..14480.00 rows=1000000 width=4) (actual time=0.025..35.646 rows=1000000 loops=1)
-> Seq Scan on t1 t1_1 (cost=0.00..14480.00 rows=1000000 width=4) (actual time=0.011..36.163 rows=1000000 loops=1)
Planning Time: 0.045 ms
Execution Time: 246.372 ms
(5 rows)

regression=# explain analyze select distinct a from t1 intersect select distinct a from t1;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
HashSetOp Intersect (cost=33960.00..33962.52 rows=101 width=4) (actual time=238.888..238.891 rows=101 loops=1)
-> HashAggregate (cost=16980.00..16981.01 rows=101 width=4) (actual time=120.749..120.756 rows=101 loops=1)
Group Key: t1.a
Batches: 1 Memory Usage: 24kB
-> Seq Scan on t1 (cost=0.00..14480.00 rows=1000000 width=4) (actual time=0.027..35.391 rows=1000000 loops=1)
-> HashAggregate (cost=16980.00..16981.01 rows=101 width=4) (actual time=118.101..118.107 rows=101 loops=1)
Group Key: t1_1.a
Batches: 1 Memory Usage: 24kB
-> Seq Scan on t1 t1_1 (cost=0.00..14480.00 rows=1000000 width=4) (actual time=0.014..35.468 rows=1000000 loops=1)
Planning Time: 0.043 ms
Execution Time: 238.916 ms
(11 rows)

It's still a little slower without DISTINCT, but not 50% slower like
before. I have hopes that it'll be nearly on par once I figure out
how to avoid the extra per-row slot type conversion that's being done
in the 0001 patch.

Aside from that minor TODO, the main thing that's left undone in this
patch series is to persuade the thing to exploit presorted input
paths. Right now it fails to do so, as can be seen in some of the
regression test cases, eg

regression=# set enable_hashagg = 0;
SET
regression=# explain (costs off) select unique1 from tenk1 except select unique2 from tenk1 where unique2 != 10;
QUERY PLAN
------------------------------------------------------------------
SetOp Except
-> Sort
Sort Key: tenk1.unique1
-> Index Only Scan using tenk1_unique1 on tenk1
-> Sort
Sort Key: tenk1_1.unique2
-> Index Only Scan using tenk1_unique2 on tenk1 tenk1_1
Filter: (unique2 <> 10)
(8 rows)

Obviously the sorts are unnecessary, but it's not seeing that.
I suppose this requires integrating generate_nonunion_paths with
the logic from commit 66c0185a3. I tried to make sense of that,
but failed --- either I don't understand it, or there are a
number of things wrong with it. I'd welcome some help with
getting that done.

regards, tom lane

[1]: /messages/by-id/2631313.1730733484@sss.pgh.pa.us

Richard Guo

guofenglinux@gmail.com

over 1 year ago

In reply to: Tom Lane (#1)

Re: Converting SetOp to read its two inputs separately

On Thu, Nov 14, 2024 at 11:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Aside from that minor TODO, the main thing that's left undone in this
patch series is to persuade the thing to exploit presorted input
paths. Right now it fails to do so, as can be seen in some of the
regression test cases, eg

regression=# set enable_hashagg = 0;
SET
regression=# explain (costs off) select unique1 from tenk1 except select unique2 from tenk1 where unique2 != 10;
QUERY PLAN
------------------------------------------------------------------
SetOp Except
-> Sort
Sort Key: tenk1.unique1
-> Index Only Scan using tenk1_unique1 on tenk1
-> Sort
Sort Key: tenk1_1.unique2
-> Index Only Scan using tenk1_unique2 on tenk1 tenk1_1
Filter: (unique2 <> 10)
(8 rows)

Obviously the sorts are unnecessary, but it's not seeing that.
I suppose this requires integrating generate_nonunion_paths with
the logic from commit 66c0185a3. I tried to make sense of that,
but failed --- either I don't understand it, or there are a
number of things wrong with it. I'd welcome some help with
getting that done.

I think we may need to do the following to make this work:

1. We need to teach set_operation_ordered_results_useful() that sorted
input paths are also useful for INTERSECT/EXCEPT, so that we can have
setop_pathkeys set for the subqueries.

2. In generate_nonunion_paths(), we need to provide a valid
"interesting_pathkeys" when calling build_setop_child_paths().

Thanks
Richard

Richard Guo

guofenglinux@gmail.com

over 1 year ago

In reply to: Richard Guo (#2)

Re: Converting SetOp to read its two inputs separately

On Thu, Nov 14, 2024 at 6:28 PM Richard Guo <guofenglinux@gmail.com> wrote:

1. We need to teach set_operation_ordered_results_useful() that sorted
input paths are also useful for INTERSECT/EXCEPT, so that we can have
setop_pathkeys set for the subqueries.

BTW, I noticed a typo in the comment for function
set_operation_ordered_results_useful().

  * set_operation_ordered_results_useful
  *     Return true if the given SetOperationStmt can be executed by utilizing
  *     paths that provide sorted input according to the setop's targetlist.
- *     Returns false when sorted paths are not any more useful then unsorted
+ *     Returns false when sorted paths are not any more useful than unsorted
  *     ones.

Thanks
Richard

Tom Lane

tgl@sss.pgh.pa.us

over 1 year ago

In reply to: Richard Guo (#2)

Re: Converting SetOp to read its two inputs separately

Richard Guo <guofenglinux@gmail.com> writes:

On Thu, Nov 14, 2024 at 11:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Aside from that minor TODO, the main thing that's left undone in this
patch series is to persuade the thing to exploit presorted input
paths.

I think we may need to do the following to make this work:
1. We need to teach set_operation_ordered_results_useful() that sorted
input paths are also useful for INTERSECT/EXCEPT, so that we can have
setop_pathkeys set for the subqueries.
2. In generate_nonunion_paths(), we need to provide a valid
"interesting_pathkeys" when calling build_setop_child_paths().

Once I'd wrapped my head around how things are done now (which the
comments in prepunion.c were remarkably unhelpful about), I saw that
most of the problem for #2 just requires re-ordering things that
generate_nonunion_paths was already doing. As for #1, I have a modest
proposal: we should get rid of set_operation_ordered_results_useful
entirely. It's not the code that actually does useful work, and
keeping it in sync with the code that does do useful work is hard and
unnecessary.

0001-0003 below are the same as before (so the slot-munging TODO is
still there). 0004 fixes a rather basic bug for nested set-operations
and gets rid of set_operation_ordered_results_useful along the way.
Then 0005 does your step 2.

regards, tom lane

David Rowley

dgrowleyml@gmail.com

over 1 year ago

In reply to: Richard Guo (#3)

Re: Converting SetOp to read its two inputs separately

On Thu, 14 Nov 2024 at 22:33, Richard Guo <guofenglinux@gmail.com> wrote:

BTW, I noticed a typo in the comment for function

- *     Returns false when sorted paths are not any more useful then unsorted
+ *     Returns false when sorted paths are not any more useful than unsorted

I pushed a fix for that.

Thanks.

David

David Rowley

dgrowleyml@gmail.com

over 1 year ago

In reply to: Tom Lane (#4)

Re: Converting SetOp to read its two inputs separately

On Wed, 20 Nov 2024 at 15:09, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Once I'd wrapped my head around how things are done now (which the
comments in prepunion.c were remarkably unhelpful about), I saw that
most of the problem for #2 just requires re-ordering things that
generate_nonunion_paths was already doing. As for #1, I have a modest
proposal: we should get rid of set_operation_ordered_results_useful
entirely. It's not the code that actually does useful work, and
keeping it in sync with the code that does do useful work is hard and
unnecessary.

0001-0003 below are the same as before (so the slot-munging TODO is
still there). 0004 fixes a rather basic bug for nested set-operations
and gets rid of set_operation_ordered_results_useful along the way.
Then 0005 does your step 2.

Here's a quick review of all 5 patches together.

1. In setop_load_group(), the primary concern with the result of
setop_compare_slots() seems to be if the tuples match or not. If
you're not too concerned with keeping the Assert checking for
mis-sorts, then it could be much more efficient for many cases to
check the final column first. ExecBuildGroupingEqual() makes use of
this knowledge already, so it's nothing new. Maybe you could just use
an ExprState made by ExecBuildGroupingEqual() instead of
setop_compare_slots() for this case. That would allow JIT to work.

2. (related to #1) I'm unsure if it's also worth deforming all the
needed attributes in one go rather than calling slot_getattr() each
time, which will result in incrementally deforming the tuple.
ExecBuildGroupingEqual() also does that.

3.
/* XXX hack: force the tuple into minimal form */
/* XXX should not have to do this */
ExecForceStoreHeapTuple(ExecCopySlotHeapTuple(innerslot),
setopstate->ps.ps_ResultTupleSlot, true);
innerslot = setopstate->ps.ps_ResultTupleSlot;

nodeAgg.c seems to do this by using prepare_hash_slot() which deforms
the heap tuple and sets the Datums verbatim rather than making copies
of any byref ones.

4. Since the EXPLAIN output is going to change for SetOps, what are
your thoughts on being more explicit about the setop strategy being
used? Showing "SetOp Except" isn't that informative. You've got to
look at the non-text format to know it's using the merge method. How
about "MergeSetOp Except"?

5. You might not be too worried, but in SetOpState if you move
need_init up to below setop_done, you'll close a 7-byte hole in that
struct

I'm also happy if you just push the 0004 patch. Are you planning to
backpatch that? If I'd realised you were proposing to remove that
function, I'd not have fixed the typo Richard mentioned.

David

Tom Lane

tgl@sss.pgh.pa.us

over 1 year ago

In reply to: David Rowley (#6)

Re: Converting SetOp to read its two inputs separately

David Rowley <dgrowleyml@gmail.com> writes:

Here's a quick review of all 5 patches together.

Thanks for looking at it!

1. In setop_load_group(), the primary concern with the result of
setop_compare_slots() seems to be if the tuples match or not. If
you're not too concerned with keeping the Assert checking for
mis-sorts, then it could be much more efficient for many cases to
check the final column first.

Hm. setop_compare_slots() has to deliver a 3-way compare result for
its usage in setop_retrieve_sorted(), where we're actually doing
a merge. We could potentially use a different subroutine within
setop_load_group(), but I'm not sure it's worth added complexity.

2. (related to #1) I'm unsure if it's also worth deforming all the
needed attributes in one go rather than calling slot_getattr() each
time, which will result in incrementally deforming the tuple.

Good point; probably worth doing given that the low-order
column is often going to determine the result.

/* XXX hack: force the tuple into minimal form */
/* XXX should not have to do this */

nodeAgg.c seems to do this by using prepare_hash_slot() which deforms
the heap tuple and sets the Datums verbatim rather than making copies
of any byref ones.

I'll take a look at that, thanks for the pointer.

4. Since the EXPLAIN output is going to change for SetOps, what are
your thoughts on being more explicit about the setop strategy being
used? Showing "SetOp Except" isn't that informative. You've got to
look at the non-text format to know it's using the merge method. How
about "MergeSetOp Except"?

I'm confused here. explain.c already uses "SetOp" versus "HashSetOp"
to show the strategy. Are you saying we should instead print
"MergeSetOp" and "HashSetOp"? That seems like change for the sake
of change, really.

I'm also happy if you just push the 0004 patch. Are you planning to
backpatch that?

I've been debating whether to propose a back-patch of that. It's
clearly a bug fix in the sense that the current code isn't doing
what was intended. On the other hand, it isn't (I think) resulting
in any wrong answers, just plans that are a bit less efficient
than they could be, and perhaps some planning time wasted on making
Paths that can't be used. And generally we refrain from making
unnecessary changes of plan choices in released branches, because
users value plan stability. So on the whole I'm leaning towards
not back-patching, but I could be persuaded otherwise if anyone
feels strongly that we should.

regards, tom lane

Tom Lane

tgl@sss.pgh.pa.us

over 1 year ago

In reply to: Tom Lane (#7)

Re: Converting SetOp to read its two inputs separately

I wrote:

David Rowley <dgrowleyml@gmail.com> writes:

nodeAgg.c seems to do this by using prepare_hash_slot() which deforms
the heap tuple and sets the Datums verbatim rather than making copies
of any byref ones.

I'll take a look at that, thanks for the pointer.

So after digging into it, I realized that I'd been completely misled
by setop_fill_hash_table's not failing while reading the first input.
That's not because the code was correct, it was because none of our
test cases manage to reach TupleHashTableMatch() while reading the
first input. Apparently there are no regression test cases where
tuples of the first input have identical hash codes. But as soon
as I tried a case with duplicate rows in the first input, kaboom!

The short answer therefore is that LookupTupleHashEntry() absolutely
requires a MinimalTuple slot as input, and there is not some magic
code path whereby it doesn't. So I put in some code similar to
prepare_hash_slot() to do that as efficiently as can be managed.

I also switched to using slot_getallattrs() per your suggestion.
Other than that and rebasing, this is the same as v2.

regards, tom lane

Tom Lane

tgl@sss.pgh.pa.us

over 1 year ago

In reply to: Tom Lane (#8)

Re: Converting SetOp to read its two inputs separately

I wrote:

So after digging into it, I realized that I'd been completely misled
by setop_fill_hash_table's not failing while reading the first input.
That's not because the code was correct, it was because none of our
test cases manage to reach TupleHashTableMatch() while reading the
first input. Apparently there are no regression test cases where
tuples of the first input have identical hash codes. But as soon
as I tried a case with duplicate rows in the first input, kaboom!

I didn't quite believe that theory, and so I went back to look again.
Indeed we do have test cases exercising duplicate-row removal, so
why wasn't TupleHashTableMatch crashing? The answer turns out to
be less about duplicates (although we need at least a hash-code
match to reach the problem) and more about the form of the input.
This test case from union.sql doesn't crash (with my v2 patch):

SELECT q2 FROM int8_tbl INTERSECT SELECT q1 FROM int8_tbl ORDER BY 1;

but this does:

SELECT * FROM int8_tbl INTERSECT SELECT * FROM int8_tbl ORDER BY 1;

The reason is that "SELECT *" doesn't have to do a projection,
so the SeqScan node just returns its scan tuple slot, which is
a BufferHeapTuple slot, and that triggers the wrong-slot-type
crash. But "SELECT q2" does have to do a projection, so what
it returns is a Virtual tuple slot, and that's okay! That's
not so much about restrictions of LookupTupleHashEntry as it
is about this optimization in ExecComputeSlotInfo:

/* if the slot is known to always virtual we never need to deform */
if (op->d.fetch.fixed && op->d.fetch.kind == &TTSOpsVirtual)
return false;

That causes us not to emit the EEOP_INNER_FETCHSOME opcode that
is spitting up in the BufferHeapTuple case. And apparently all
the rest of LookupTupleHashEntry is good with a virtual slot.

I can't avoid the feeling that this has all been optimized a little
too far, because it's darn near impossible to understand what has
gone wrong when something goes wrong.

Anyway, it's certainly unsafe for nodeSetOp.c to assume that its
two inputs will return slots of identical types. Since
LookupTupleHashEntry does call this tuple-comparison code that is
happy to wire in assumptions about what the input slot type is,
we had better insert a buffering slot that all the data goes through,
as the v3 patch does.

I'm inclined to add a couple of test cases similar to

SELECT * FROM int8_tbl INTERSECT SELECT q2, q1 FROM int8_tbl ORDER BY 1,2;

so that we get some coverage of cases where the input slots
are dissimilar.

regards, tom lane

#10

Tom Lane

tgl@sss.pgh.pa.us

over 1 year ago

In reply to: Tom Lane (#9)

Re: Converting SetOp to read its two inputs separately

With David's recent fixes to allow telling BuildTupleHashTableExt
what input slot type to expect, it's possible to remove the per-row
slot type conversions I was doing before. So here's an updated
patchset with that done.

The common_result_slot_type() function I wrote here perhaps
should be made generally available, but I didn't do that yet.

0002-0005 are the same as before.

regards, tom lane

#11

David Rowley

dgrowleyml@gmail.com

over 1 year ago

In reply to: Tom Lane (#10)

Re: Converting SetOp to read its two inputs separately

On Thu, 19 Dec 2024 at 15:44, Tom Lane <tgl@sss.pgh.pa.us> wrote:

With David's recent fixes to allow telling BuildTupleHashTableExt
what input slot type to expect, it's possible to remove the per-row
slot type conversions I was doing before. So here's an updated
patchset with that done.

The common_result_slot_type() function I wrote here perhaps
should be made generally available, but I didn't do that yet.

I think it would be good to make this generic as it can be at least
used in nodeRecursiveunion.c and nodeAppend.c. For the Append usage,
I wonder if it's worth making the function accept an array of
PlanStates rather than just two of them. Then, either add a helper
function that allocates and assigns the outer and inner side to a
stack-allocated two-element array, or just do that part in nodeSetOp.c
and nodeRecursiveunion.c and pass that to the array-accepting
function.

Review:

0001:

+ /* Left group is first, has no right matches */

Should the comma be an "or"?

0002:

Should this adjust the following nodeSetOp.c comment:

* In SETOP_SORTED mode, each input has been sorted according to all the
* grouping columns (ie, all the non-junk attributes). The SetOp node

This implies that there can be junk columns.

0003:

Looks fine. The only thing that I paused to think about was your
decision to double disable hashing when !enable_hashagg and the
results are bigger than get_hash_memory_limit(). I think have you have
makes sense there.

0004:

Looks fine. I see no reason to delay you pushing this.

0005:

Looks good.

David

#12

Tom Lane

tgl@sss.pgh.pa.us

over 1 year ago

In reply to: David Rowley (#11)

Re: Converting SetOp to read its two inputs separately

David Rowley <dgrowleyml@gmail.com> writes:

On Thu, 19 Dec 2024 at 15:44, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The common_result_slot_type() function I wrote here perhaps
should be made generally available, but I didn't do that yet.

I think it would be good to make this generic as it can be at least
used in nodeRecursiveunion.c and nodeAppend.c.

OK, done, and I added an 0006 patch that uses that infrastructure
in the obvious places.

I also addressed your remarks about comments. Otherwise I'm feeling
like this is about ready to push.

regards, tom lane

#13

David Rowley

dgrowleyml@gmail.com

over 1 year ago

In reply to: Tom Lane (#12)

Re: Converting SetOp to read its two inputs separately

On Fri, 20 Dec 2024 at 08:38, Tom Lane <tgl@sss.pgh.pa.us> wrote:

David Rowley <dgrowleyml@gmail.com> writes:

On Thu, 19 Dec 2024 at 15:44, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The common_result_slot_type() function I wrote here perhaps
should be made generally available, but I didn't do that yet.

I think it would be good to make this generic as it can be at least
used in nodeRecursiveunion.c and nodeAppend.c.

OK, done, and I added an 0006 patch that uses that infrastructure
in the obvious places.

That looks good. Thanks for adjusting the other node types too.

I also addressed your remarks about comments. Otherwise I'm feeling
like this is about ready to push.

I think so too.

One minor detail... I think the only thing I'd like to see is the
moving of the enable_hashagg checks to increment the disabled_nodes
count in create_setop_path() instead of where it's being called. I
understand there's only 1 caller of that function that passes
SETOP_HASHED, but it does seem nicer to put that logic where it
belongs. With how you have it now, if we were ever to grow any more
places that built SETOP_HASHED SetOpPaths, they'd also need to adjust
disabled_nodes manually and that seems easy to forget. Also, looking
around for references to "disabled_nodes", it looks like all other
places where we fiddle with the value of disabled_nodes are in
costsize.c. I understand we do check enable_hashagg in other places,
but those all seem to be so we avoid generating some Path rather than
to determine the disabled_node value.

David

#14

Tom Lane

tgl@sss.pgh.pa.us

over 1 year ago

In reply to: David Rowley (#13)

Re: Converting SetOp to read its two inputs separately

David Rowley <dgrowleyml@gmail.com> writes:

One minor detail... I think the only thing I'd like to see is the
moving of the enable_hashagg checks to increment the disabled_nodes
count in create_setop_path() instead of where it's being called. I
understand there's only 1 caller of that function that passes
SETOP_HASHED, but it does seem nicer to put that logic where it
belongs. With how you have it now, if we were ever to grow any more
places that built SETOP_HASHED SetOpPaths, they'd also need to adjust
disabled_nodes manually and that seems easy to forget. Also, looking
around for references to "disabled_nodes", it looks like all other
places where we fiddle with the value of disabled_nodes are in
costsize.c.

Looks like costsize.c and pathnode.c to me, but either way I take your
point. I'd not realized that Robert set it up that way, but now I see
he did. I agree that moving that bit of logic into
create_setop_path() seems better. I'll make it so and push.

regards, tom lane

#15

Tom Lane

tgl@sss.pgh.pa.us

over 1 year ago

In reply to: Tom Lane (#14)

Re: Converting SetOp to read its two inputs separately

Pushed ... and now I have one more beef about the way things are
in this area. I don't think we should leave the compatibility
function BuildTupleHashTable() in place in HEAD. Making it a
wrapper around a new function BuildTupleHashTableExt() was a fine
solution for preserving ABI in released branches, but that doesn't
mean we should clutter the code with unused ABI hacks forevermore.
Attached is a patch to take it out and then rename
BuildTupleHashTableExt() back to BuildTupleHashTable().

Since BuildTupleHashTableExt has already grown more arguments
in HEAD than it had in v17, renaming it doesn't increase the number of
places that will have to be touched in any extensions that were using
this infrastructure. Removal of the compatibility wrapper could force
some code updates, but really we want those places to update anyway.

I also made an effort at fixing the woefully out of date
header comment for it.

regards, tom lane

#16

David Rowley

dgrowleyml@gmail.com

over 1 year ago

In reply to: Tom Lane (#15)

Re: Converting SetOp to read its two inputs separately

On Fri, 20 Dec 2024 at 11:23, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Pushed ... and now I have one more beef about the way things are
in this area. I don't think we should leave the compatibility
function BuildTupleHashTable() in place in HEAD. Making it a
wrapper around a new function BuildTupleHashTableExt() was a fine
solution for preserving ABI in released branches, but that doesn't
mean we should clutter the code with unused ABI hacks forevermore.
Attached is a patch to take it out and then rename
BuildTupleHashTableExt() back to BuildTupleHashTable().

No complaints here. Thanks for cleaning that up.

I couldn't help but also notice the nbuckets parameter is using the
"long" datatype. The code in BuildTupleHashTable seems to think it's
fine to pass the long as the uint32 parameter to tuplehash_create().
I just thought if we're in the area of adjusting this API function's
signature then it might be worth fixing the "long" issue at the same
time, or at least in the same release.

I'm also quite keen to see less use of long as it's not a very
consistently sized datatype on all platforms which can lead to
platform dependent bugs.

David

#17

Tom Lane

tgl@sss.pgh.pa.us

over 1 year ago

In reply to: David Rowley (#16)

Re: Converting SetOp to read its two inputs separately

David Rowley <dgrowleyml@gmail.com> writes:

On Fri, 20 Dec 2024 at 11:23, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Attached is a patch to take it out and then rename
BuildTupleHashTableExt() back to BuildTupleHashTable().

No complaints here. Thanks for cleaning that up.

Thanks, will push.

I couldn't help but also notice the nbuckets parameter is using the
"long" datatype. The code in BuildTupleHashTable seems to think it's
fine to pass the long as the uint32 parameter to tuplehash_create().
I just thought if we're in the area of adjusting this API function's
signature then it might be worth fixing the "long" issue at the same
time, or at least in the same release.

I'm in favor of doing that, but it seems like a separate patch,
because we'd have to chase things back a fairly long way.
For instance, the numGroups fields in Agg, RecursiveUnion, and
SetOp are all "long" at the moment, and some of the callers
are getting their arguments via clamp_cardinality_to_long()
which'd need adjustment, etc etc.

I'm also quite keen to see less use of long as it's not a very
consistently sized datatype on all platforms which can lead to
platform dependent bugs.

Yeah. Switching all these places to int64 likely would be
worthwhile cleanup (but I'm not volunteering). Also, if
tuplehash_create expects a uint32, that isn't consistent
either.

regards, tom lane

Converting SetOp to read its two inputs separately

Attachments:

Attachments:

Attachments:

Attachments:

Attachments:

Attachments: