Aggregate transition state merging vs. hypothetical set functions
I started to look into fixing orderedsetaggs.c so that we could revert
52328727b, and soon found a rather nasty problem. Although the plain
OSAs seem amenable to supporting multiple finalfn calls on the same
transition state, the "hypothetical set" functions are not at all.
What they do is to accumulate all the regular aggregate input into
a tuplesort, then add the direct arguments as an additional "hypothetical"
row, and finally sort the result. There's no realistic way of undoing
the addition of the hypothetical row, so these finalfns simply can't
share tuplesort state.
You could imagine accumulating the regular input into a tuplestore
and then copying it into a tuplesort in each finalfn call. But that
seems pretty icky, and I'm not sure it'd really be any more performant
than just keeping separate tuplesort objects for each aggregate.
So my conclusion is that we *must* teach nodeAgg.c not to merge
transition states for these functions. But I do not want some
hard-wired rule that all and only hypothetical-set aggregates
are nonmergeable. Our notion of an AGGKIND_HYPOTHETICAL aggregate
is mostly a syntactical one about requiring there to be direct
argument(s) matching the aggregated argument(s) in type. I do not
think that that should translate to an assumption about how the
aggregate implementation works (indeed, the CREATE AGGREGATE man
page says in so many words that HYPOTHETICAL affects only type
matching, not runtime behavior). Moreover, having seen this example,
I think it's likely that there are other cases in which it's more
trouble than it's worth for an aggregate to support merging of
transition states.
Therefore, I think we need to bite the bullet and provide an aggregate
property (CREATE AGGREGATE argument / pg_aggregate column) that tells
whether the aggregate supports transition state merging. Likely this
should have been in the state-merging patch to begin with, but better
late than never.
The main thing that probably has to be hashed out before we can write
that patch is what the default should be for user-created aggregates.
I am inclined to think that we should err on the side of safety and
default it to false (no merge support). You could argue that the
lack of complaints since 9.6 came out is sufficient evidence that
defaulting to true would be all right, but I'm not sure.
Comments?
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 13 October 2017 at 12:08, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Therefore, I think we need to bite the bullet and provide an aggregate
property (CREATE AGGREGATE argument / pg_aggregate column) that tells
whether the aggregate supports transition state merging. Likely this
should have been in the state-merging patch to begin with, but better
late than never.The main thing that probably has to be hashed out before we can write
that patch is what the default should be for user-created aggregates.
I am inclined to think that we should err on the side of safety and
default it to false (no merge support). You could argue that the
lack of complaints since 9.6 came out is sufficient evidence that
defaulting to true would be all right, but I'm not sure.
Are you considering that this is an option only for ordered-set
aggregates or for all?
If the user defines their normal aggregate as not safe for merging,
then surely it'll not be suitable to be used as a window function
either, since the final function will also be called there multiple
times per state.
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
David Rowley <david.rowley@2ndquadrant.com> writes:
On 13 October 2017 at 12:08, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Therefore, I think we need to bite the bullet and provide an aggregate
property (CREATE AGGREGATE argument / pg_aggregate column) that tells
whether the aggregate supports transition state merging. Likely this
should have been in the state-merging patch to begin with, but better
late than never.
Are you considering that this is an option only for ordered-set
aggregates or for all?
All.
If the user defines their normal aggregate as not safe for merging,
then surely it'll not be suitable to be used as a window function
either, since the final function will also be called there multiple
times per state.
Yeah, we would probably also want to check the flag in nodeWindowAgg.
Not sure exactly how that should play out --- maybe we end up with
a tri-valued property "works as normal agg without merging, works
as normal agg with merging, works as window agg". But this would
arguably be an improvement over the current situation. Right now
I'm sure there are user-written aggs out there that will just crash
if used as a window agg, and the authors don't have much choice because
the performance costs of not modifying the transition state in the
finalfn are higher than they're willing to bear. At least with a
flag they could ensure that the case will fail cleanly.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 13 October 2017 at 12:41, Tom Lane <tgl@sss.pgh.pa.us> wrote:
David Rowley <david.rowley@2ndquadrant.com> writes:
If the user defines their normal aggregate as not safe for merging,
then surely it'll not be suitable to be used as a window function
either, since the final function will also be called there multiple
times per state.Yeah, we would probably also want to check the flag in nodeWindowAgg.
Not sure exactly how that should play out --- maybe we end up with
a tri-valued property "works as normal agg without merging, works
as normal agg with merging, works as window agg". But this would
arguably be an improvement over the current situation. Right now
I'm sure there are user-written aggs out there that will just crash
if used as a window agg, and the authors don't have much choice because
the performance costs of not modifying the transition state in the
finalfn are higher than they're willing to bear. At least with a
flag they could ensure that the case will fail cleanly.
hmm, maybe I'm lacking imagination here, but surely the final function
is either destructive or it's not? I can't understand what the
difference between nodeAgg.c calling the finalfn multiple times on the
same state and nodeWindowAgg.c doing it. Maybe there's something I'm
not accounting for that you are?
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
David Rowley <david.rowley@2ndquadrant.com> writes:
On 13 October 2017 at 12:41, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Yeah, we would probably also want to check the flag in nodeWindowAgg.
Not sure exactly how that should play out --- maybe we end up with
a tri-valued property "works as normal agg without merging, works
as normal agg with merging, works as window agg".
hmm, maybe I'm lacking imagination here, but surely the final function
is either destructive or it's not? I can't understand what the
difference between nodeAgg.c calling the finalfn multiple times on the
same state and nodeWindowAgg.c doing it. Maybe there's something I'm
not accounting for that you are?
nodeWindowAgg is doing something more: not only is it calling the finalfn
repeatedly, but it's continuing to mutate the transition state in between.
The ordered-set aggs provide a counterexample to considering that to be
equivalent to state merging. The OSAs can cope with state merging as
long as they have a flag to make sure only the first finalfn does
tuplesort_performsort ... but that's not good enough to make them workable
as window aggs. Once we sort, we can't absorb more rows into the
tuplesort object.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 10/13/2017 02:41 AM, Tom Lane wrote:
David Rowley <david.rowley@2ndquadrant.com> writes:
On 13 October 2017 at 12:08, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Therefore, I think we need to bite the bullet and provide an aggregate
property (CREATE AGGREGATE argument / pg_aggregate column) that tells
whether the aggregate supports transition state merging. Likely this
should have been in the state-merging patch to begin with, but better
late than never.Are you considering that this is an option only for ordered-set
aggregates or for all?All.
If the user defines their normal aggregate as not safe for merging,
then surely it'll not be suitable to be used as a window function
either, since the final function will also be called there multiple
times per state.Yeah, we would probably also want to check the flag in nodeWindowAgg.
Not sure exactly how that should play out --- maybe we end up with
a tri-valued property "works as normal agg without merging, works
as normal agg with merging, works as window agg". But this would
arguably be an improvement over the current situation. Right now
I'm sure there are user-written aggs out there that will just crash
if used as a window agg, and the authors don't have much choice because
the performance costs of not modifying the transition state in the
finalfn are higher than they're willing to bear. At least with a
flag they could ensure that the case will fail cleanly.
Sounds right to me. I'm not so sure there really are aggregates out
there that would crash today if used as a window aggregate, but it sure
would be nice to give some control on that.
We've been doing that window agg thing for a long time, so I think
"works as window agg" should be the default for regular aggregates. For
ordered-set aggregates, "no merging, no more transfn() calls after
finalfn()" seems safest.
It's a bit of a shame to have different defaults for regular and
ordered-set aggregates. But that is what we implicitly assume today,
anyway, and it's not too that hard to document.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 10/13/2017 02:08 AM, Tom Lane wrote:
I started to look into fixing orderedsetaggs.c so that we could revert
52328727b, and soon found a rather nasty problem. Although the plain
OSAs seem amenable to supporting multiple finalfn calls on the same
transition state, the "hypothetical set" functions are not at all.
What they do is to accumulate all the regular aggregate input into
a tuplesort, then add the direct arguments as an additional "hypothetical"
row, and finally sort the result. There's no realistic way of undoing
the addition of the hypothetical row, so these finalfns simply can't
share tuplesort state.You could imagine accumulating the regular input into a tuplestore
and then copying it into a tuplesort in each finalfn call. But that
seems pretty icky, and I'm not sure it'd really be any more performant
than just keeping separate tuplesort objects for each aggregate.
The current implementation, with the extra flag column, is quite naive.
We could add some support to tuplesort to find a row with given
attributes, and call that instead of actually adding the hypothetical
row to the tuplesort and iterating to re-find it after performsort. For
a result set that fits in memory, you could even do a binary search
instead of linearly iterating through the result set.
I'm not suggesting that we do that right now, though. And even if we
did, there might be other aggregates out there that are not safe to
merge, so we should still add the option to CREATE AGGREGATE. But
that'd be nice little project, if someone wanted to make those functions
faster.
- Heikki
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Heikki Linnakangas <hlinnaka@iki.fi> writes:
On 10/13/2017 02:08 AM, Tom Lane wrote:
I started to look into fixing orderedsetaggs.c so that we could revert
52328727b, and soon found a rather nasty problem. Although the plain
OSAs seem amenable to supporting multiple finalfn calls on the same
transition state, the "hypothetical set" functions are not at all.
What they do is to accumulate all the regular aggregate input into
a tuplesort, then add the direct arguments as an additional "hypothetical"
row, and finally sort the result. There's no realistic way of undoing
the addition of the hypothetical row, so these finalfns simply can't
share tuplesort state.
The current implementation, with the extra flag column, is quite naive.
We could add some support to tuplesort to find a row with given
attributes, and call that instead of actually adding the hypothetical
row to the tuplesort and iterating to re-find it after performsort. For
a result set that fits in memory, you could even do a binary search
instead of linearly iterating through the result set.
I'd had some idle thoughts about using a heap to support window
aggregation of an OSA, rather than ever performing a full sort.
It'd probably work well as long as the window doesn't get wide enough
that the heap stops fitting in memory. But in any case, the immediate
point is that we shouldn't just assume that every aggregate
implementation is capable of supporting state merging, and while we're
at it, it'd be better to have a non-syntax-based distinction between
aggs that can support use as window functions and aggs that can't.
Moving on to the exact color of the bikeshed: it seems like the right
way to present this to users of CREATE AGGREGATE is in terms of "does
the final function modify the transition state?". So maybe the values
could be spelled
SMODIFY = READ_ONLY ffunc never touches state, ok as window agg
SMODIFY = SHARABLE ffunc does some one-time change like sorting,
so state merging is OK but not window agg
SMODIFY = READ_WRITE ffunc trashes state, can't do merging either
I'm not set on these names by any means; anyone have a better idea?
Also, I'm not sure whether we'd need a separate MMODIFY flag for
the moving-aggregate sub-implementation, but it seems possible.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Heikki Linnakangas <hlinnaka@iki.fi> writes:
We've been doing that window agg thing for a long time, so I think
"works as window agg" should be the default for regular aggregates. For
ordered-set aggregates, "no merging, no more transfn() calls after
finalfn()" seems safest.
It's a bit of a shame to have different defaults for regular and
ordered-set aggregates. But that is what we implicitly assume today,
anyway, and it's not too that hard to document.
It's kind of a pain in the rear, really, especially for purposes like
pg_dump, which will have to know the rule to decide whether it can
omit an SMODIFY clause. (Which it should do if the value is default,
so as not to make back-porting of dumps harder than necessary.)
On reflection though I see little choice. Right now it's impossible
to use an OSA as a window agg at all because the parser doesn't support
that combination of options, and nodeWindowAgg is lacking necessary
support too. If we ever get around to relaxing that, it would really
be necessary that the default for user-created OSAs be "not safe as
window agg", because almost certainly they wouldn't be. So I think
there's no option but to have different default safety levels for OSA
and normal aggs. As long as we're forced into that anyway, it does
make sense to default to "not safe to merge" as well. We don't know
for sure whether there are any user-created OSAs in the wild, but
if there are, they likely borrowed code from orderedsetaggs.c.
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
I wrote:
Moving on to the exact color of the bikeshed: it seems like the right
way to present this to users of CREATE AGGREGATE is in terms of "does
the final function modify the transition state?". So maybe the values
could be spelled
SMODIFY = READ_ONLY ffunc never touches state, ok as window agg
SMODIFY = SHARABLE ffunc does some one-time change like sorting,
so state merging is OK but not window agg
SMODIFY = READ_WRITE ffunc trashes state, can't do merging either
I'm not set on these names by any means; anyone have a better idea?
After contemplating the existing CREATE AGGREGATE parameters, particularly
[M]FINALFUNC_EXTRA, it seemed to me that better nomenclature is
[M]FINALFUNC_MODIFY = READ_ONLY
[M]FINALFUNC_MODIFY = STOP_UPDATES
[M]FINALFUNC_MODIFY = READ_WRITE
where "stop updates" is intended to imply "you can't call the transfn
anymore after the first finalfn call". I'm still not that much in
love with that terminology, but don't have a better idea now.
Attached is a WIP patch; I believe it's code-complete but the SGML
docs are lacking. Barring objections I'll finish up the docs and
push this.
regards, tom lane
Attachments:
mark-aggregate-finalfn-safety-explicitly.patchtext/x-diff; charset=us-ascii; name=mark-aggregate-finalfn-safety-explicitly.patchDownload+682-452
On 10/13/17 19:01, Tom Lane wrote:
Moving on to the exact color of the bikeshed: it seems like the right
way to present this to users of CREATE AGGREGATE is in terms of "does
the final function modify the transition state?". So maybe the values
could be spelled
SMODIFY = READ_ONLY ffunc never touches state, ok as window agg
SMODIFY = SHARABLE ffunc does some one-time change like sorting,
so state merging is OK but not window agg
SMODIFY = READ_WRITE ffunc trashes state, can't do merging either
I'm not set on these names by any means; anyone have a better idea?
Is "sharable" the preferred spelling, as opposed to "shareable"?
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
On 10/13/17 19:01, Tom Lane wrote:
SMODIFY = SHARABLE ffunc does some one-time change like sorting,
so state merging is OK but not window agg
Is "sharable" the preferred spelling, as opposed to "shareable"?
Hmm ... I looked in two different dictionaries, and they both say
sharable is a variant of shareable. So while it's not wrong exactly,
I seem to have picked the less preferred spelling.
I could run around and change it, but I'd have to do so *right now*
I think --- once we release beta1 the costs of changing this would
go up.
Will do so if there's not objections in an hour or so.
regards, tom lane