EXPLAIN VERBOSE with parallel Aggregate

Started by David Rowleyover 9 years ago30 messages

david.rowley@2ndquadrant.com

over 9 years ago

1 attachment(s)

There's 2 problems:

I recently noticed that EXPLAIN VERBOSE is a bit bogus when it comes
to parallel aggregates with FILTER (WHERE ...) clauses.

We get;

Output: pg_catalog.sum((sum(num) FILTER (WHERE (num > 0)))) FILTER
(WHERE (num > 0))

Which is simply a lie, we only filter on the partial aggregate, not the combine.

The attached patch just nullifies the combine aggregate's aggfilter
clause during setrefs. We cannot nullify it any sooner, as the
aggfilter is required so that we find the correct partial Aggref in
search_indexed_tlist_for_partial_aggref().

With the attached we get;

Output: pg_catalog.sum((sum(num) FILTER (WHERE (num > 0))))

The patch is very simple.

I'm unsure if the schema prefix on the combine aggregate is a problem
or not. The code path which generates it is rather unfortunate and is
down to func_get_detail() returning FUNCDETAIL_NOTFOUND in
generate_function_name() due to not being able to find a function
named "sum" with the transtype as its only argument. I had thought
that maybe we should add a pg_proc entry for the aggregate with the
transtype, if not already covered by the entry for aggfnoid.
Aggregates where the transtype is the same as the input type work just
fine;

Output: max((max(num)))

The problem with that is adding the pg_proc entry still won't get rid
of the schema as the "p_funcid == funcid" test in
generate_function_name() will fail causing the schema qualification to
occur again. But at least func_get_detail() would be successful in
finding the function.

Any one have any thoughts on if this is a problem?

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

parallel_agg_filter_fix.patchapplication/octet-stream; name=parallel_agg_filter_fix.patchDownload

diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 5537c14..c670d88 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -2462,6 +2462,13 @@ fix_combine_agg_expr_mutator(Node *node, fix_upper_expr_context *context)
 			 */
 			newtle = makeTargetEntry((Expr *) newvar, 1, NULL, false);
 			newaggref = (Aggref *) copyObject(aggref);
+
+			/*
+			 * aggfilter is irrelevant for combine nodes. Let's just nullify
+			 * it so it's not seen in EXPLAIN.
+			 */
+			newaggref->aggfilter = NULL;
+
 			newaggref->args = list_make1(newtle);
 
 			return (Node *) newaggref;

Robert Haas

robertmhaas@gmail.com

over 9 years ago

In reply to: David Rowley (#1)

Re: EXPLAIN VERBOSE with parallel Aggregate

On Wed, Apr 13, 2016 at 11:03 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

There's 2 problems:

1)

I recently noticed that EXPLAIN VERBOSE is a bit bogus when it comes
to parallel aggregates with FILTER (WHERE ...) clauses.

We get;

Output: pg_catalog.sum((sum(num) FILTER (WHERE (num > 0)))) FILTER
(WHERE (num > 0))

Which is simply a lie, we only filter on the partial aggregate, not the combine.

The attached patch just nullifies the combine aggregate's aggfilter
clause during setrefs. We cannot nullify it any sooner, as the
aggfilter is required so that we find the correct partial Aggref in
search_indexed_tlist_for_partial_aggref().

With the attached we get;

Output: pg_catalog.sum((sum(num) FILTER (WHERE (num > 0))))

The patch is very simple.

2)

I'm unsure if the schema prefix on the combine aggregate is a problem
or not. The code path which generates it is rather unfortunate and is
down to func_get_detail() returning FUNCDETAIL_NOTFOUND in
generate_function_name() due to not being able to find a function
named "sum" with the transtype as its only argument. I had thought
that maybe we should add a pg_proc entry for the aggregate with the
transtype, if not already covered by the entry for aggfnoid.
Aggregates where the transtype is the same as the input type work just
fine;

Output: max((max(num)))

The problem with that is adding the pg_proc entry still won't get rid
of the schema as the "p_funcid == funcid" test in
generate_function_name() will fail causing the schema qualification to
occur again. But at least func_get_detail() would be successful in
finding the function.

Any one have any thoughts on if this is a problem?

I definitely agree that the current output is messed up, but I'm not
sure your proposed output is much better. I wonder if it shouldn't
say something like:

Output: serialfn(transfn(args))

for the partial aggregate and

Output: finalfn(combinefn(deserialfn(args)))

for the finalize aggregate step.

Or maybe just insert the word PARTIAL before each partial aggregate
step, like PARTIAL sum(num) for the partial step and then just
sum(num) for the final step. I think ending up with sum(sum(num)) is
right out. It doesn't look so bad for that case but avg(avg(num))
would certainly imply something that's not the actual behavior.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tom Lane

tgl@sss.pgh.pa.us

over 9 years ago

In reply to: Robert Haas (#2)

Re: EXPLAIN VERBOSE with parallel Aggregate

Robert Haas <robertmhaas@gmail.com> writes:

I definitely agree that the current output is messed up, but I'm not
sure your proposed output is much better. I wonder if it shouldn't
say something like:
Output: serialfn(transfn(args))
for the partial aggregate and
Output: finalfn(combinefn(deserialfn(args)))
for the finalize aggregate step.

Or maybe just insert the word PARTIAL before each partial aggregate
step, like PARTIAL sum(num) for the partial step and then just
sum(num) for the final step.

+1 for the latter, if we can do it conveniently. I think exposing
the names of the aggregate implementation functions would be very
user-unfriendly, as nobody but us hackers knows what those are.

I think ending up with sum(sum(num)) is
right out. It doesn't look so bad for that case but avg(avg(num))
would certainly imply something that's not the actual behavior.

Agreed.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

David Rowley

david.rowley@2ndquadrant.com

over 9 years ago

In reply to: Tom Lane (#3)

1 attachment(s)

Re: EXPLAIN VERBOSE with parallel Aggregate

On 16 April 2016 at 04:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

I definitely agree that the current output is messed up, but I'm not
sure your proposed output is much better. I wonder if it shouldn't
say something like:
Output: serialfn(transfn(args))
for the partial aggregate and
Output: finalfn(combinefn(deserialfn(args)))
for the finalize aggregate step.

Or maybe just insert the word PARTIAL before each partial aggregate
step, like PARTIAL sum(num) for the partial step and then just
sum(num) for the final step.

+1 for the latter, if we can do it conveniently. I think exposing
the names of the aggregate implementation functions would be very
user-unfriendly, as nobody but us hackers knows what those are.

It does not really seem all that convenient to do this. It also seems
a bit strange to me to have a parent node report a column which does
not exist in any nodes descending from it. Remember that the combine
Aggref does not have the same ->args as its corresponding partial
Aggref. It's not all that clear to me if there is any nice way to do
have this work the way you'd like. If we were to just call
get_rule_expr() on the first arg of the combine aggregate node, it
would re-print the PARTIAL keyword again.

In the attached I have it displaying as:

postgres=# explain verbose select count(*),max(a),avg(a) filter (where
a = 0) from t;
QUERY PLAN
-------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=14739.56..14739.57 rows=1 width=44)
Output: pg_catalog.count(*), max((max(a))), pg_catalog.avg((PARTIAL
avg(a) FILTER (WHERE (a = 0))))
-> Gather (cost=14739.33..14739.54 rows=2 width=44)
Output: (count(*)), (max(a)), (PARTIAL avg(a) FILTER (WHERE (a = 0)))
Workers Planned: 2
-> Partial Aggregate (cost=13739.33..13739.34 rows=1 width=44)
Output: count(*), max(a), PARTIAL avg(a) FILTER (WHERE (a = 0))
-> Parallel Seq Scan on public.t (cost=0.00..9572.67
rows=416667 width=4)
Output: a
(9 rows)

I like this much better, as there's no fudging of any function
arguments to print them as something they're not.

Note that max() does not get tagged with PARTIAL since it has no
finalfn. Obtaining this information does require a syscache lookup in
get_agg_expr(), but I've managed to make that only happen when
aggpartial is true.

I think ending up with sum(sum(num)) is
right out. It doesn't look so bad for that case but avg(avg(num))
would certainly imply something that's not the actual behavior.

Agreed.

Note that I've done nothing for the weird schema prefixing problem I
mentioned. I think I'd need input on the best way to solve this. If
it's actually a problem.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

parallel_agg_explain_verbose_v2.patchapplication/octet-stream; name=parallel_agg_explain_verbose_v2.patchDownload

diff --git a/src/backend/executor/execQual.c b/src/backend/executor/execQual.c
index 4df4a9b..aeeac7b 100644
--- a/src/backend/executor/execQual.c
+++ b/src/backend/executor/execQual.c
@@ -4517,6 +4517,15 @@ ExecInitExpr(Expr *node, PlanState *parent)
 					AggState   *aggstate = (AggState *) parent;
 					Aggref	   *aggref = (Aggref *) node;
 
+					if (aggstate->finalizeAggs == aggref->aggpartial)
+					{
+						/* planner messed up */
+						if (aggref->aggpartial)
+							elog(ERROR, "partial Aggref found in finalized aggregate node");
+						else
+							elog(ERROR, "non-partial Aggref found in partial aggregate node");
+					}
+
 					if (aggstate->finalizeAggs &&
 						aggref->aggoutputtype != aggref->aggtype)
 					{
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index a21928b..04077ce 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -1244,6 +1244,7 @@ _copyAggref(const Aggref *from)
 	COPY_NODE_FIELD(aggfilter);
 	COPY_SCALAR_FIELD(aggstar);
 	COPY_SCALAR_FIELD(aggvariadic);
+	COPY_SCALAR_FIELD(aggpartial);
 	COPY_SCALAR_FIELD(aggkind);
 	COPY_SCALAR_FIELD(agglevelsup);
 	COPY_LOCATION_FIELD(location);
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 3c6c567..213cb15 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -202,6 +202,7 @@ _equalAggref(const Aggref *a, const Aggref *b)
 	COMPARE_NODE_FIELD(aggfilter);
 	COMPARE_SCALAR_FIELD(aggstar);
 	COMPARE_SCALAR_FIELD(aggvariadic);
+	COMPARE_SCALAR_FIELD(aggpartial);
 	COMPARE_SCALAR_FIELD(aggkind);
 	COMPARE_SCALAR_FIELD(agglevelsup);
 	COMPARE_LOCATION_FIELD(location);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 5ac7446..e8eaf10 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1040,6 +1040,7 @@ _outAggref(StringInfo str, const Aggref *node)
 	WRITE_NODE_FIELD(aggfilter);
 	WRITE_BOOL_FIELD(aggstar);
 	WRITE_BOOL_FIELD(aggvariadic);
+	WRITE_BOOL_FIELD(aggpartial);
 	WRITE_CHAR_FIELD(aggkind);
 	WRITE_UINT_FIELD(agglevelsup);
 	WRITE_LOCATION_FIELD(location);
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 8059594..7ff9c91 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -556,6 +556,7 @@ _readAggref(void)
 	READ_NODE_FIELD(aggfilter);
 	READ_BOOL_FIELD(aggstar);
 	READ_BOOL_FIELD(aggvariadic);
+	READ_BOOL_FIELD(aggpartial);
 	READ_CHAR_FIELD(aggkind);
 	READ_UINT_FIELD(agglevelsup);
 	READ_LOCATION_FIELD(location);
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 5537c14..a1b1db4 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -2464,6 +2464,13 @@ fix_combine_agg_expr_mutator(Node *node, fix_upper_expr_context *context)
 			newaggref = (Aggref *) copyObject(aggref);
 			newaggref->args = list_make1(newtle);
 
+			/*
+			 * A combine aggregate node does not perform FILTER, so we'd
+			 * better remove any filter clauses so they're not shown in
+			 * EXPLAIN.
+			 */
+			newaggref->aggfilter = NULL;
+
 			return (Node *) newaggref;
 		}
 		else
diff --git a/src/backend/optimizer/util/tlist.c b/src/backend/optimizer/util/tlist.c
index 4c8c83d..c7872b6 100644
--- a/src/backend/optimizer/util/tlist.c
+++ b/src/backend/optimizer/util/tlist.c
@@ -785,6 +785,7 @@ apply_partialaggref_adjustment(PathTarget *target)
 			aggform = (Form_pg_aggregate) GETSTRUCT(aggTuple);
 
 			newaggref = (Aggref *) copyObject(aggref);
+			newaggref->aggpartial = true;
 
 			/* use the serialization type, if one exists */
 			if (OidIsValid(aggform->aggserialtype))
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 2b47e95..d22eb15 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -8215,6 +8215,31 @@ get_agg_expr(Aggref *aggref, deparse_context *context)
 	/* Extract the argument types as seen by the parser */
 	nargs = get_aggregate_argtypes(aggref, argtypes);
 
+	/*
+	 * We prefix partial aggregates which have final functions with the
+	 * keyword "PARTIAL". This avoids printing confusing things like
+	 * avg(avg(x)). We'll happily print max(max(x)), since aggregates like max
+	 * have no final function.
+	 */
+	if (aggref->aggpartial)
+	{
+		HeapTuple	aggTuple;
+		Form_pg_aggregate aggform;
+
+		/* check if the aggregate has a final function */
+		aggTuple = SearchSysCache1(AGGFNOID,
+							   ObjectIdGetDatum(aggref->aggfnoid));
+		if (!HeapTupleIsValid(aggTuple))
+			elog(ERROR, "cache lookup failed for aggregate %u",
+				 aggref->aggfnoid);
+		aggform = (Form_pg_aggregate) GETSTRUCT(aggTuple);
+
+		if (OidIsValid(aggform->aggfinalfn))
+			appendStringInfoString(buf, "PARTIAL ");
+
+		ReleaseSysCache(aggTuple);
+	}
+
 	/* Print the aggregate name, schema-qualified if needed */
 	appendStringInfo(buf, "%s(%s",
 					 generate_function_name(aggref->aggfnoid, nargs,
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 1ffc0a1..ac23cab 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -280,6 +280,7 @@ typedef struct Aggref
 	bool		aggstar;		/* TRUE if argument list was really '*' */
 	bool		aggvariadic;	/* true if variadic arguments have been
 								 * combined into an array last argument */
+	bool		aggpartial;		/* true if belongs to partial agg node */
 	char		aggkind;		/* aggregate kind (see pg_aggregate.h) */
 	Index		agglevelsup;	/* > 0 if agg belongs to outer query */
 	int			location;		/* token location, or -1 if unknown */

Tom Lane

tgl@sss.pgh.pa.us

over 9 years ago

In reply to: David Rowley (#4)

Re: EXPLAIN VERBOSE with parallel Aggregate

David Rowley <david.rowley@2ndquadrant.com> writes:

On 16 April 2016 at 04:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:

+1 for the latter, if we can do it conveniently. I think exposing
the names of the aggregate implementation functions would be very
user-unfriendly, as nobody but us hackers knows what those are.

It does not really seem all that convenient to do this. It also seems
a bit strange to me to have a parent node report a column which does
not exist in any nodes descending from it. Remember that the combine
Aggref does not have the same ->args as its corresponding partial
Aggref. It's not all that clear to me if there is any nice way to do
have this work the way you'd like. If we were to just call
get_rule_expr() on the first arg of the combine aggregate node, it
would re-print the PARTIAL keyword again.

This suggests to me that the parsetree representation for partial
aggregates was badly chosen. If EXPLAIN can't recognize them, then
neither can anything else, and we may soon find needs for telling
the difference that are not just cosmetic. Maybe we need more
fields in Aggref.

Note that I've done nothing for the weird schema prefixing problem I
mentioned. I think I'd need input on the best way to solve this. If
it's actually a problem.

It sure looks like a problem to me. Maybe it's only cosmetic, but
it's going to cause confusion and bug reports. EXPLAIN output for
parallel queries is going to be confusing enough anyway; we need
to avoid having artifacts like this.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Noah Misch

noah@leadboat.com

over 9 years ago

In reply to: Tom Lane (#5)

Re: EXPLAIN VERBOSE with parallel Aggregate

On Sun, Apr 17, 2016 at 10:22:24AM -0400, Tom Lane wrote:

David Rowley <david.rowley@2ndquadrant.com> writes:

On 16 April 2016 at 04:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:

+1 for the latter, if we can do it conveniently. I think exposing
the names of the aggregate implementation functions would be very
user-unfriendly, as nobody but us hackers knows what those are.

It does not really seem all that convenient to do this. It also seems
a bit strange to me to have a parent node report a column which does
not exist in any nodes descending from it. Remember that the combine
Aggref does not have the same ->args as its corresponding partial
Aggref. It's not all that clear to me if there is any nice way to do
have this work the way you'd like. If we were to just call
get_rule_expr() on the first arg of the combine aggregate node, it
would re-print the PARTIAL keyword again.

This suggests to me that the parsetree representation for partial
aggregates was badly chosen. If EXPLAIN can't recognize them, then
neither can anything else, and we may soon find needs for telling
the difference that are not just cosmetic. Maybe we need more
fields in Aggref.

[This is a generic notification.]

The above-described topic is currently a PostgreSQL 9.6 open item. Robert,
since you committed the patch believed to have created it, you own this open
item. If that responsibility lies elsewhere, please let us know whose
responsibility it is to fix this. Since new open items may be discovered at
any time and I want to plan to have them all fixed well in advance of the ship
date, I will appreciate your efforts toward speedy resolution. Please
present, within 72 hours, a plan to fix the defect within seven days of this
message. Thanks.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Robert Haas

robertmhaas@gmail.com

over 9 years ago

In reply to: Noah Misch (#6)

Re: EXPLAIN VERBOSE with parallel Aggregate

On Tue, Apr 19, 2016 at 9:22 PM, Noah Misch <noah@leadboat.com> wrote:

On Sun, Apr 17, 2016 at 10:22:24AM -0400, Tom Lane wrote:

David Rowley <david.rowley@2ndquadrant.com> writes:

On 16 April 2016 at 04:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:

+1 for the latter, if we can do it conveniently. I think exposing
the names of the aggregate implementation functions would be very
user-unfriendly, as nobody but us hackers knows what those are.

It does not really seem all that convenient to do this. It also seems
a bit strange to me to have a parent node report a column which does
not exist in any nodes descending from it. Remember that the combine
Aggref does not have the same ->args as its corresponding partial
Aggref. It's not all that clear to me if there is any nice way to do
have this work the way you'd like. If we were to just call
get_rule_expr() on the first arg of the combine aggregate node, it
would re-print the PARTIAL keyword again.

This suggests to me that the parsetree representation for partial
aggregates was badly chosen. If EXPLAIN can't recognize them, then
neither can anything else, and we may soon find needs for telling
the difference that are not just cosmetic. Maybe we need more
fields in Aggref.

[This is a generic notification.]

The above-described topic is currently a PostgreSQL 9.6 open item. Robert,
since you committed the patch believed to have created it, you own this open
item. If that responsibility lies elsewhere, please let us know whose
responsibility it is to fix this. Since new open items may be discovered at
any time and I want to plan to have them all fixed well in advance of the ship
date, I will appreciate your efforts toward speedy resolution. Please
present, within 72 hours, a plan to fix the defect within seven days of this
message. Thanks.

I'll do my best to work on this soon. I'm not happy with the output
produced by David's patch, but I don't expect I'll be able to do
better without putting some time into it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

David Rowley

david.rowley@2ndquadrant.com

over 9 years ago

In reply to: Tom Lane (#5)

1 attachment(s)

Re: EXPLAIN VERBOSE with parallel Aggregate

On 18 April 2016 at 02:22, Tom Lane <tgl@sss.pgh.pa.us> wrote:

David Rowley <david.rowley@2ndquadrant.com> writes:

Note that I've done nothing for the weird schema prefixing problem I
mentioned. I think I'd need input on the best way to solve this. If
it's actually a problem.

It sure looks like a problem to me. Maybe it's only cosmetic, but
it's going to cause confusion and bug reports. EXPLAIN output for
parallel queries is going to be confusing enough anyway; we need
to avoid having artifacts like this.

I'd like to admit that I'm a bit confused as to why
generate_function_name(), when it already knows the funcid, bothers to
call func_get_detail(), which performs a search for the function based
on the name and argument types, to find the function, most likely with
the same funcid as the one which it already knew.

Could you explain why this has to happen?

I also tried patching with the attached and running the regression
tests to see what breaks... nothing did. So it seems that, at least in
the tests that that code path is never hit.

With that in mind, perhaps the fix for the namespace problem is just
to special case the combine Aggrefs in get_agg_expr() and have it just
lookup the pg_proc entry for the aggred->aggfnoid, and use that
proname.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

generate_function_name_experiment.patchapplication/octet-stream; name=generate_function_name_experiment.patchDownload

diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 2b47e95..bace60d 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -9708,12 +9708,17 @@ generate_function_name(Oid funcid, int nargs, List *argnames, Oid *argtypes,
 	 * find it.
 	 */
 	if (!force_qualify)
+	{
 		p_result = func_get_detail(list_make1(makeString(proname)),
 								   NIL, argnames, nargs, argtypes,
 								   !use_variadic, true,
 								   &p_funcid, &p_rettype,
 								   &p_retset, &p_nvargs, &p_vatype,
 								   &p_true_typeids, NULL);
+
+		if (funcid != p_funcid)
+			elog(ERROR, "!!!! %s() %u != %u", proname, funcid, p_funcid);
+	}
 	else
 	{
 		p_result = FUNCDETAIL_NOTFOUND;

Tom Lane

tgl@sss.pgh.pa.us

over 9 years ago

In reply to: David Rowley (#8)

Re: EXPLAIN VERBOSE with parallel Aggregate

David Rowley <david.rowley@2ndquadrant.com> writes:

I'd like to admit that I'm a bit confused as to why
generate_function_name(), when it already knows the funcid, bothers to
call func_get_detail(), which performs a search for the function based
on the name and argument types, to find the function, most likely with
the same funcid as the one which it already knew.

Could you explain why this has to happen?

The point is exactly to find out whether a search for the function
given only the name and argument types would find the same function,
or a similar function in a different schema --- which would happen
if that other schema is earlier in the search path than the desired
one, or maybe the desired one isn't in search_path at all. In such
a case we must schema-qualify the function name in the printed
expression.

To some extent this is because ruleutils serves two masters. We would
probably not care that much about schema exactness for EXPLAIN's purposes,
but we do care for dumping views and rules.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10

David Rowley

david.rowley@2ndquadrant.com

over 9 years ago

In reply to: Tom Lane (#9)

1 attachment(s)

Re: EXPLAIN VERBOSE with parallel Aggregate

On 21 April 2016 at 17:14, Tom Lane <tgl@sss.pgh.pa.us> wrote:

David Rowley <david.rowley@2ndquadrant.com> writes:

I'd like to admit that I'm a bit confused as to why
generate_function_name(), when it already knows the funcid, bothers to
call func_get_detail(), which performs a search for the function based
on the name and argument types, to find the function, most likely with
the same funcid as the one which it already knew.

Could you explain why this has to happen?

The point is exactly to find out whether a search for the function
given only the name and argument types would find the same function,
or a similar function in a different schema --- which would happen
if that other schema is earlier in the search path than the desired
one, or maybe the desired one isn't in search_path at all. In such
a case we must schema-qualify the function name in the printed
expression.

Thanks. That makes more sense now.

To some extent this is because ruleutils serves two masters. We would
probably not care that much about schema exactness for EXPLAIN's purposes,
but we do care for dumping views and rules.

OK, so here's my thoughts. Currently, as mentioned above, I've
included a PARTIAL prefix for partial aggregates. This is
syntactically incorrect for the dumping of views etc, but that should
not matter as partial Aggrefs never come from the parser, they're only
perhaps generated later in the planner. Same goes for combine
aggregates too.

In the attached I'm proposing that we simply just use the
pg_proc.proname which belongs to the aggref->aggfnoid for combine
aggregates. This gets around the function not being found by
generate_function_name() and the namespace problem, that code should
never be executed when getting a view def, since we can't have combine
aggs there.

The attached still does not get the output into the way Robert would
have liked, but I still stand by my dislike to pretending the combine
aggregate is a normal aggregate. It's not all that clear if FILTER
should be displayed in the combine agg. Combine Aggs don't do FILTER.

This makes the output:

postgres=# explain verbose select avg(num) FILTER (WHERE num >
0),sum(num),count(*) from i;
QUERY PLAN
-------------------------------------------------------------------------------------------
Finalize Aggregate (cost=13758.56..13758.57 rows=1 width=48)
Output: avg((PARTIAL avg(num) FILTER (WHERE (num > 0)))),
sum((sum(num))), count(*)
-> Gather (cost=13758.33..13758.54 rows=2 width=48)
Output: (PARTIAL avg(num) FILTER (WHERE (num > 0))),
(sum(num)), (count(*))
Workers Planned: 2
-> Partial Aggregate (cost=12758.33..12758.34 rows=1 width=48)
Output: PARTIAL avg(num) FILTER (WHERE (num > 0)),
sum(num), count(*)
-> Parallel Seq Scan on public.i (cost=0.00..8591.67
rows=416667 width=4)
Output: num

Comments welcome.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

parallel_agg_explain_verbose_v3.patchapplication/octet-stream; name=parallel_agg_explain_verbose_v3.patchDownload

diff --git a/src/backend/executor/execQual.c b/src/backend/executor/execQual.c
index 4df4a9b..8f240ae 100644
--- a/src/backend/executor/execQual.c
+++ b/src/backend/executor/execQual.c
@@ -4517,6 +4517,24 @@ ExecInitExpr(Expr *node, PlanState *parent)
 					AggState   *aggstate = (AggState *) parent;
 					Aggref	   *aggref = (Aggref *) node;
 
+					if (aggstate->finalizeAggs == aggref->aggpartial)
+					{
+						/* planner messed up */
+						if (aggref->aggpartial)
+							elog(ERROR, "partial Aggref found in finalized aggregate node");
+						else
+							elog(ERROR, "non-partial Aggref found in partial aggregate node");
+					}
+
+					if (aggstate->combineStates != aggref->aggcombine)
+					{
+						/* planner messed up */
+						if (aggref->aggcombine)
+							elog(ERROR, "combine Aggref found in non-combine aggregate node");
+						else
+							elog(ERROR, "non-combine Aggref found in combine aggregate node");
+					}
+
 					if (aggstate->finalizeAggs &&
 						aggref->aggoutputtype != aggref->aggtype)
 					{
diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index a21928b..08ed990 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -1244,6 +1244,8 @@ _copyAggref(const Aggref *from)
 	COPY_NODE_FIELD(aggfilter);
 	COPY_SCALAR_FIELD(aggstar);
 	COPY_SCALAR_FIELD(aggvariadic);
+	COPY_SCALAR_FIELD(aggcombine);
+	COPY_SCALAR_FIELD(aggpartial);
 	COPY_SCALAR_FIELD(aggkind);
 	COPY_SCALAR_FIELD(agglevelsup);
 	COPY_LOCATION_FIELD(location);
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 3c6c567..c5ccc42 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -202,6 +202,8 @@ _equalAggref(const Aggref *a, const Aggref *b)
 	COMPARE_NODE_FIELD(aggfilter);
 	COMPARE_SCALAR_FIELD(aggstar);
 	COMPARE_SCALAR_FIELD(aggvariadic);
+	COMPARE_SCALAR_FIELD(aggcombine);
+	COMPARE_SCALAR_FIELD(aggpartial);
 	COMPARE_SCALAR_FIELD(aggkind);
 	COMPARE_SCALAR_FIELD(agglevelsup);
 	COMPARE_LOCATION_FIELD(location);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 5ac7446..c2f0e0f 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1040,6 +1040,8 @@ _outAggref(StringInfo str, const Aggref *node)
 	WRITE_NODE_FIELD(aggfilter);
 	WRITE_BOOL_FIELD(aggstar);
 	WRITE_BOOL_FIELD(aggvariadic);
+	WRITE_BOOL_FIELD(aggcombine);
+	WRITE_BOOL_FIELD(aggpartial);
 	WRITE_CHAR_FIELD(aggkind);
 	WRITE_UINT_FIELD(agglevelsup);
 	WRITE_LOCATION_FIELD(location);
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 8059594..6f28047 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -556,6 +556,8 @@ _readAggref(void)
 	READ_NODE_FIELD(aggfilter);
 	READ_BOOL_FIELD(aggstar);
 	READ_BOOL_FIELD(aggvariadic);
+	READ_BOOL_FIELD(aggcombine);
+	READ_BOOL_FIELD(aggpartial);
 	READ_CHAR_FIELD(aggkind);
 	READ_UINT_FIELD(agglevelsup);
 	READ_LOCATION_FIELD(location);
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 5537c14..bd6ad9cc 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -2463,6 +2463,14 @@ fix_combine_agg_expr_mutator(Node *node, fix_upper_expr_context *context)
 			newtle = makeTargetEntry((Expr *) newvar, 1, NULL, false);
 			newaggref = (Aggref *) copyObject(aggref);
 			newaggref->args = list_make1(newtle);
+			newaggref->aggcombine = true;
+
+			/*
+			 * A combine aggregate node does not perform FILTER, so we'd
+			 * better remove any filter clauses so they're not shown in
+			 * EXPLAIN.
+			 */
+			newaggref->aggfilter = NULL;
 
 			return (Node *) newaggref;
 		}
diff --git a/src/backend/optimizer/util/tlist.c b/src/backend/optimizer/util/tlist.c
index 4c8c83d..c7872b6 100644
--- a/src/backend/optimizer/util/tlist.c
+++ b/src/backend/optimizer/util/tlist.c
@@ -785,6 +785,7 @@ apply_partialaggref_adjustment(PathTarget *target)
 			aggform = (Form_pg_aggregate) GETSTRUCT(aggTuple);
 
 			newaggref = (Aggref *) copyObject(aggref);
+			newaggref->aggpartial = true;
 
 			/* use the serialization type, if one exists */
 			if (OidIsValid(aggform->aggserialtype))
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 1b8f0ae..6ed316e 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -8233,14 +8233,63 @@ get_agg_expr(Aggref *aggref, deparse_context *context)
 	/* Extract the argument types as seen by the parser */
 	nargs = get_aggregate_argtypes(aggref, argtypes);
 
-	/* Print the aggregate name, schema-qualified if needed */
-	appendStringInfo(buf, "%s(%s",
-					 generate_function_name(aggref->aggfnoid, nargs,
-											NIL, argtypes,
-											aggref->aggvariadic,
-											&use_variadic,
-											context->special_exprkind),
-					 (aggref->aggdistinct != NIL) ? "DISTINCT " : "");
+	/*
+	 * We prefix partial aggregates which have final functions with the
+	 * keyword "PARTIAL". This avoids printing confusing things like
+	 * avg(avg(x)). We'll happily print max(max(x)), since aggregates like max
+	 * have no final function.
+	 */
+	if (aggref->aggpartial)
+	{
+		HeapTuple	aggTuple;
+		Form_pg_aggregate aggform;
+
+		/* check if the aggregate has a final function */
+		aggTuple = SearchSysCache1(AGGFNOID,
+							   ObjectIdGetDatum(aggref->aggfnoid));
+		if (!HeapTupleIsValid(aggTuple))
+			elog(ERROR, "cache lookup failed for aggregate %u",
+				 aggref->aggfnoid);
+		aggform = (Form_pg_aggregate) GETSTRUCT(aggTuple);
+
+		if (OidIsValid(aggform->aggfinalfn))
+			appendStringInfoString(buf, "PARTIAL ");
+
+		ReleaseSysCache(aggTuple);
+	}
+
+	/*
+	 * Combine Aggrefs should never be generated from the parser, so it's OK
+	 * to just display the proname as the function name, and not bother
+	 * looking it up in generate_function_name(). This gets us around the
+	 * problem that the args of a combine Aggref are not suitable for
+	 * generate_function_name() to find the correct aggregate function.
+	 */
+	if (aggref->aggcombine)
+	{
+		HeapTuple	proctup;
+		Form_pg_proc procform;
+		Oid funcid = aggref->aggfnoid;
+
+		proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
+		if (!HeapTupleIsValid(proctup))
+			elog(ERROR, "cache lookup failed for function %u", funcid);
+		procform = (Form_pg_proc) GETSTRUCT(proctup);
+		appendStringInfo(buf, "%s(%s", NameStr(procform->proname),
+						 (aggref->aggdistinct != NIL) ? "DISTINCT " : "");
+		ReleaseSysCache(proctup);
+	}
+	else
+	{
+		/* Print the aggregate name, schema-qualified if needed */
+		appendStringInfo(buf, "%s(%s",
+						 generate_function_name(aggref->aggfnoid, nargs,
+												NIL, argtypes,
+												aggref->aggvariadic,
+												&use_variadic,
+												context->special_exprkind),
+						 (aggref->aggdistinct != NIL) ? "DISTINCT " : "");
+	}
 
 	if (AGGKIND_IS_ORDERED_SET(aggref->aggkind))
 	{
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 1ffc0a1..c4ed544 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -261,6 +261,10 @@ typedef struct Param
  * interesting happens in the Aggref itself, but we must set the output data
  * type to whatever type is used for transition values.
  *
+ * 'aggcombine' and 'aggpartial' should never be true for Aggrefs which come
+ * from the parser. Such Aggrefs are only ever generated inside the planner
+ * in order to support multi-phase aggregation.
+ *
  * Note: If you are adding fields here you may also need to add a comparison
  * in search_indexed_tlist_for_partial_aggref()
  */
@@ -280,6 +284,8 @@ typedef struct Aggref
 	bool		aggstar;		/* TRUE if argument list was really '*' */
 	bool		aggvariadic;	/* true if variadic arguments have been
 								 * combined into an array last argument */
+	bool		aggcombine;		/* true if belongs to combine agg node */
+	bool		aggpartial;		/* true if belongs to partial agg node */
 	char		aggkind;		/* aggregate kind (see pg_aggregate.h) */
 	Index		agglevelsup;	/* > 0 if agg belongs to outer query */
 	int			location;		/* token location, or -1 if unknown */

#11

Robert Haas

robertmhaas@gmail.com

over 9 years ago

In reply to: David Rowley (#10)

Re: EXPLAIN VERBOSE with parallel Aggregate

On Thu, Apr 21, 2016 at 8:57 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

OK, so here's my thoughts. Currently, as mentioned above, I've
included a PARTIAL prefix for partial aggregates. This is
syntactically incorrect for the dumping of views etc, but that should
not matter as partial Aggrefs never come from the parser, they're only
perhaps generated later in the planner. Same goes for combine
aggregates too.

Yeah. As I've said a few times, I would like to have SQL syntax that
emits the unfinalized (but serialized, if type internal) values, so
that postgres_fdw can use that to partial aggregation to the remote
side. Maybe we should consider what would be reasonable syntax for
this; but that's a second-order consideration for now.

The attached still does not get the output into the way Robert would
have liked, but I still stand by my dislike to pretending the combine
aggregate is a normal aggregate. It's not all that clear if FILTER
should be displayed in the combine agg. Combine Aggs don't do FILTER.

I am rather confused by this, for two reasons:

1. The "Output" line is supposed to display the columns that the plan
node is producing. And a FinalizeAggregate had darned well better be
producing the same results as an Aggregate, so it's entirely
reasonable for it to produce the same output that an Aggregate would
have given us.

2. You're using the term "combine agg", but as far as the EXPLAIN
ANALYZE output is concerned, that's not a thing. There is
PartialAggregate, Aggregate, and FinalizeAggregate. I think you mean
FinalizeAggregate when you say "combine aggregate", since we identify
a node as a FinalizeAggregate by observing that combineStates = true.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12

Robert Haas

robertmhaas@gmail.com

over 9 years ago

In reply to: Tom Lane (#5)

Re: EXPLAIN VERBOSE with parallel Aggregate

On Sun, Apr 17, 2016 at 10:22 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

David Rowley <david.rowley@2ndquadrant.com> writes:

On 16 April 2016 at 04:27, Tom Lane <tgl@sss.pgh.pa.us> wrote:

+1 for the latter, if we can do it conveniently. I think exposing
the names of the aggregate implementation functions would be very
user-unfriendly, as nobody but us hackers knows what those are.

It does not really seem all that convenient to do this. It also seems
a bit strange to me to have a parent node report a column which does
not exist in any nodes descending from it. Remember that the combine
Aggref does not have the same ->args as its corresponding partial
Aggref. It's not all that clear to me if there is any nice way to do
have this work the way you'd like. If we were to just call
get_rule_expr() on the first arg of the combine aggregate node, it
would re-print the PARTIAL keyword again.

This suggests to me that the parsetree representation for partial
aggregates was badly chosen. If EXPLAIN can't recognize them, then
neither can anything else, and we may soon find needs for telling
the difference that are not just cosmetic. Maybe we need more
fields in Aggref.

So the basic problem is this code in setrefs.c:

newvar = search_indexed_tlist_for_partial_aggref(aggref,
context->subplan_itlist,
context->newvarno);
if (newvar)
{
Aggref *newaggref;
TargetEntry *newtle;

/*
* Now build a new TargetEntry for the Aggref's arguments which is
* a single Var which references the corresponding AggRef in the
* node below.
*/
newtle = makeTargetEntry((Expr *) newvar, 1, NULL, false);
newaggref = (Aggref *) copyObject(aggref);
newaggref->args = list_make1(newtle);

So what ends up happening is that, before setrefs processing, the
FinalizeAggregate's aggref has the same argument list as what the user
originally specified, but during that process, we replace the original
argument list with a 1-element argument list that points to the output
of the partial aggregate step. After that, it's unsurprising that the
deparsing logic goes wrong; consider especially the case of an
aggregate that originally took any number of arguments other than one.
Offhand, I see two somewhat reasonable strategies for trying to fix
this:

1. Add a field "List *origargs" to Aggref, and in this case set
newaggref->origargs to a copy of the original argument list. Use
origargs for deparsing, unless it's NULL, in which case use args. Or
alternative, always populate origargs, but in other cases just make it
equal to args.

2. Add a field "bool aggcombine" to args, and set it to true in this
case. When we see that in deparsing, expect the argument list to be
one element long, a TargetEntry containing a Var. Use that to dig out
the partial Aggref to which it points, and deparse that instead. I
guess maybe get_variable() could be used for this purpose.

There might be another approach, too. Thoughts?

(Note that I'm assuming here that the final aggregate's target list
output should match what we would have gotten from a regular
aggregate, despite the combining stage in the middle. I think that's
correct; we're outputting the same thing, even though we computed it
differently.)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13

David Rowley

david.rowley@2ndquadrant.com

over 9 years ago

In reply to: Robert Haas (#11)

Re: EXPLAIN VERBOSE with parallel Aggregate

On 23 April 2016 at 06:35, Robert Haas <robertmhaas@gmail.com> wrote:

2. You're using the term "combine agg", but as far as the EXPLAIN
ANALYZE output is concerned, that's not a thing. There is
PartialAggregate, Aggregate, and FinalizeAggregate. I think you mean
FinalizeAggregate when you say "combine aggregate", since we identify
a node as a FinalizeAggregate by observing that combineStates = true.

I really do mean combine when I say combine. I'm pretty sure that I've
coded everything properly to work even if there was 10 stages of
aggregation, in this case 9 of those would be combine nodes, and only
1 of them a finalize node. I can see why you'd say that at a glace of
explain.c, but it's more relevant that agg->finalizeAggs is true,
giving the combine test is in the else if condition. The test for
combineStates is only there so we don't output Finalize Aggregate for
normal 1 stage aggregates.

I simply don't think FILTER should be shown if combineStates is true.
This means that it will only be shown at the first aggregate node, the
one with combineStates == false. I think this is a good idea due to
the fact that this is the only place where FILTERing is done.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14

David Rowley

david.rowley@2ndquadrant.com

over 9 years ago

In reply to: Robert Haas (#12)

Re: EXPLAIN VERBOSE with parallel Aggregate

On 23 April 2016 at 09:19, Robert Haas <robertmhaas@gmail.com> wrote:

2. Add a field "bool aggcombine" to args, and set it to true in this
case. When we see that in deparsing, expect the argument list to be
one element long, a TargetEntry containing a Var. Use that to dig out
the partial Aggref to which it points, and deparse that instead. I
guess maybe get_variable() could be used for this purpose.

I did do that at one stage. I think perhaps one of the above patches
did it that way. The problem was that because I was just detecting
combine Aggrefs and printing the first item in args, it meant the
PARTIAL word was printed again, and that was pretty bogus, since it
was not a partial agg. I didn't see a way to do this without adding
some extra bool parameter to that whole series of functions. But I
only looked at using get_rule_expr(). I didn't look at what
get_variable() is.

There might be another approach, too. Thoughts?

(Note that I'm assuming here that the final aggregate's target list
output should match what we would have gotten from a regular
aggregate, despite the combining stage in the middle. I think that's
correct; we're outputting the same thing, even though we computed it
differently.)

Please note that in this case the final and combine node are the same
node, so I'm confused by the "combining stage in the middle" part.
There's only 2 aggregate nodes. I'm not sure how one of those is in
the middle.

I really don't think that we should print FILTER details in a combine
aggregate node. We'd be claiming to be doing something that we're
actually not doing. Please see advance_aggregates() in nodeAgg.c, and
compare that to combine_aggregates(), which is used when combineStates
== true. Notice that only advance_aggregates() bothers with the
aggfilter clause.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15

Robert Haas

robertmhaas@gmail.com

over 9 years ago

In reply to: David Rowley (#14)

Re: EXPLAIN VERBOSE with parallel Aggregate

On Fri, Apr 22, 2016 at 5:36 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

I really don't think that we should print FILTER details in a combine
aggregate node. We'd be claiming to be doing something that we're
actually not doing. Please see advance_aggregates() in nodeAgg.c, and
compare that to combine_aggregates(), which is used when combineStates
== true. Notice that only advance_aggregates() bothers with the
aggfilter clause.

I think you're wrong. The Output line says what that node outputs,
not how it's computed. And a FinalizeAggregate on top of a
PartialAggregate produces the same output as an Aggregate.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16

David Rowley

david.rowley@2ndquadrant.com

over 9 years ago

In reply to: Robert Haas (#15)

Re: EXPLAIN VERBOSE with parallel Aggregate

On 23 April 2016 at 13:58, Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Apr 22, 2016 at 5:36 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

I really don't think that we should print FILTER details in a combine
aggregate node. We'd be claiming to be doing something that we're
actually not doing. Please see advance_aggregates() in nodeAgg.c, and
compare that to combine_aggregates(), which is used when combineStates
== true. Notice that only advance_aggregates() bothers with the
aggfilter clause.

I think you're wrong. The Output line says what that node outputs,
not how it's computed. And a FinalizeAggregate on top of a
PartialAggregate produces the same output as an Aggregate.

The most basic thing I can think of to rationalise my thinking for this is:

# create table v (v varchar);
# create view v_v as select lower(v) v from v;
# explain verbose select upper(v) from v_v;
QUERY PLAN
-------------------------------------------------------------
Seq Scan on public.v (cost=0.00..30.40 rows=1360 width=32)
Output: upper(lower((v.v)::text))
(2 rows)

It seems that you're proposing that the aggregate equivalence of this should be;

QUERY PLAN
-------------------------------------------------------------
Seq Scan on public.v (cost=0.00..30.40 rows=1360 width=32)
Output: upper((v)::text)
(2 rows)

which to me seems wrong, as it's hiding the fact that lower() was called.

My arguments don't seem to be holding much weight, so I'll back down,
although it would still be interesting to hear what others have to say
about this.

In any case, thanks for stepping up to fix this. I'm sure what you're
proposing will be much better than what's there at the moment.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17

Robert Haas

robertmhaas@gmail.com

over 9 years ago

In reply to: David Rowley (#16)

Re: EXPLAIN VERBOSE with parallel Aggregate

On Fri, Apr 22, 2016 at 10:11 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

The most basic thing I can think of to rationalise my thinking for this is:

# create table v (v varchar);
# create view v_v as select lower(v) v from v;
# explain verbose select upper(v) from v_v;
QUERY PLAN
-------------------------------------------------------------
Seq Scan on public.v (cost=0.00..30.40 rows=1360 width=32)
Output: upper(lower((v.v)::text))
(2 rows)

It seems that you're proposing that the aggregate equivalence of this should be;

QUERY PLAN
-------------------------------------------------------------
Seq Scan on public.v (cost=0.00..30.40 rows=1360 width=32)
Output: upper((v)::text)
(2 rows)

which to me seems wrong, as it's hiding the fact that lower() was called.

My arguments don't seem to be holding much weight, so I'll back down,
although it would still be interesting to hear what others have to say
about this.

Yeah, I'd be happy to have more people chime in. I think your example
is interesting, but it doesn't persuade me, because the rule has
always been that EXPLAIN shows the output *columns*, not the output
*rows*. The disappearance of some rows doesn't change the list of
output columns. For scans and joins this rule is easy to apply; for
aggregates, where many rows become one, less so. Some of the existing
choices there are certainly arguable, like the fact that FILTER is
shown anywhere at all, which seems like an artifact to me. But I
think that now is not the time to rethink those decisions.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18

Robert Haas

robertmhaas@gmail.com

over 9 years ago

In reply to: Robert Haas (#17)

1 attachment(s)

Re: EXPLAIN VERBOSE with parallel Aggregate

On Mon, Apr 25, 2016 at 1:03 PM, Robert Haas <robertmhaas@gmail.com> wrote:

Yeah, I'd be happy to have more people chime in. I think your example
is interesting, but it doesn't persuade me, because the rule has
always been that EXPLAIN shows the output *columns*, not the output
*rows*. The disappearance of some rows doesn't change the list of
output columns. For scans and joins this rule is easy to apply; for
aggregates, where many rows become one, less so. Some of the existing
choices there are certainly arguable, like the fact that FILTER is
shown anywhere at all, which seems like an artifact to me. But I
think that now is not the time to rethink those decisions.

My proposed fix for this issue is attached. Review is welcome;
otherwise, I'll just commit this. The output looks like what I
suggested upthread:

rhaas=# explain verbose select count(*), corr(aid, bid) from pgbench_accounts ;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=742804.22..742804.23 rows=1 width=16)
Output: count(*), corr((aid)::double precision, (bid)::double precision)
-> Gather (cost=742804.00..742804.21 rows=2 width=40)
Output: (PARTIAL count(*)), (PARTIAL corr((aid)::double
precision, (bid)::double precision))
Workers Planned: 2
-> Partial Aggregate (cost=741804.00..741804.01 rows=1 width=40)
Output: PARTIAL count(*), PARTIAL corr((aid)::double
precision, (bid)::double precision)
-> Parallel Seq Scan on public.pgbench_accounts
(cost=0.00..616804.00 rows=12500000 width=8)
Output: aid, bid, abalance, filler
(9 rows)

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

parallel-aggregate-explain.patchapplication/x-patch; name=parallel-aggregate-explain.patchDownload

diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index a21928b..20e38f0 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -1244,6 +1244,8 @@ _copyAggref(const Aggref *from)
 	COPY_NODE_FIELD(aggfilter);
 	COPY_SCALAR_FIELD(aggstar);
 	COPY_SCALAR_FIELD(aggvariadic);
+	COPY_SCALAR_FIELD(aggpartial);
+	COPY_SCALAR_FIELD(aggcombine);
 	COPY_SCALAR_FIELD(aggkind);
 	COPY_SCALAR_FIELD(agglevelsup);
 	COPY_LOCATION_FIELD(location);
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 3c6c567..c5ccc42 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -202,6 +202,8 @@ _equalAggref(const Aggref *a, const Aggref *b)
 	COMPARE_NODE_FIELD(aggfilter);
 	COMPARE_SCALAR_FIELD(aggstar);
 	COMPARE_SCALAR_FIELD(aggvariadic);
+	COMPARE_SCALAR_FIELD(aggcombine);
+	COMPARE_SCALAR_FIELD(aggpartial);
 	COMPARE_SCALAR_FIELD(aggkind);
 	COMPARE_SCALAR_FIELD(agglevelsup);
 	COMPARE_LOCATION_FIELD(location);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 5ac7446..c2f0e0f 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1040,6 +1040,8 @@ _outAggref(StringInfo str, const Aggref *node)
 	WRITE_NODE_FIELD(aggfilter);
 	WRITE_BOOL_FIELD(aggstar);
 	WRITE_BOOL_FIELD(aggvariadic);
+	WRITE_BOOL_FIELD(aggcombine);
+	WRITE_BOOL_FIELD(aggpartial);
 	WRITE_CHAR_FIELD(aggkind);
 	WRITE_UINT_FIELD(agglevelsup);
 	WRITE_LOCATION_FIELD(location);
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 8059594..6f28047 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -556,6 +556,8 @@ _readAggref(void)
 	READ_NODE_FIELD(aggfilter);
 	READ_BOOL_FIELD(aggstar);
 	READ_BOOL_FIELD(aggvariadic);
+	READ_BOOL_FIELD(aggcombine);
+	READ_BOOL_FIELD(aggpartial);
 	READ_CHAR_FIELD(aggkind);
 	READ_UINT_FIELD(agglevelsup);
 	READ_LOCATION_FIELD(location);
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 5537c14..93fe3a3 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -2100,6 +2100,10 @@ search_indexed_tlist_for_partial_aggref(Aggref *aggref, indexed_tlist *itlist,
 				continue;
 			if (aggref->aggvariadic != tlistaggref->aggvariadic)
 				continue;
+			/*
+			 * it would be harmless to compare aggcombine and aggpartial, but
+			 * it's also unnecessary
+			 */
 			if (aggref->aggkind != tlistaggref->aggkind)
 				continue;
 			if (aggref->agglevelsup != tlistaggref->agglevelsup)
@@ -2463,6 +2467,8 @@ fix_combine_agg_expr_mutator(Node *node, fix_upper_expr_context *context)
 			newtle = makeTargetEntry((Expr *) newvar, 1, NULL, false);
 			newaggref = (Aggref *) copyObject(aggref);
 			newaggref->args = list_make1(newtle);
+			newaggref->aggcombine = true;
+			newaggref->aggpartial = false;
 
 			return (Node *) newaggref;
 		}
diff --git a/src/backend/optimizer/util/tlist.c b/src/backend/optimizer/util/tlist.c
index 4c8c83d..aa2c2f8 100644
--- a/src/backend/optimizer/util/tlist.c
+++ b/src/backend/optimizer/util/tlist.c
@@ -792,6 +792,9 @@ apply_partialaggref_adjustment(PathTarget *target)
 			else
 				newaggref->aggoutputtype = aggform->aggtranstype;
 
+			/* flag it as partial */
+			newaggref->aggpartial = true;
+
 			lfirst(lc) = newaggref;
 
 			ReleaseSysCache(aggTuple);
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 1b8f0ae..5a5d4a6 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -392,6 +392,9 @@ static void get_rule_windowspec(WindowClause *wc, List *targetList,
 					deparse_context *context);
 static char *get_variable(Var *var, int levelsup, bool istoplevel,
 			 deparse_context *context);
+static void get_special_variable(Node *node, deparse_context *context);
+static void resolve_special_varno(Node *node, deparse_context *context,
+					  void (*callback)(Node *, deparse_context *));
 static Node *find_param_referent(Param *param, deparse_context *context,
 					deparse_namespace **dpns_p, ListCell **ancestor_cell_p);
 static void get_parameter(Param *param, deparse_context *context);
@@ -407,7 +410,9 @@ static void get_rule_expr_toplevel(Node *node, deparse_context *context,
 static void get_oper_expr(OpExpr *expr, deparse_context *context);
 static void get_func_expr(FuncExpr *expr, deparse_context *context,
 			  bool showimplicit);
-static void get_agg_expr(Aggref *aggref, deparse_context *context);
+static void get_agg_expr(Aggref *aggref, deparse_context *context,
+			  bool allow_partial_decoration);
+static void get_agg_combine_expr(Node *node, deparse_context *context);
 static void get_windowfunc_expr(WindowFunc *wfunc, deparse_context *context);
 static void get_coercion_expr(Node *arg, deparse_context *context,
 				  Oid resulttype, int32 resulttypmod,
@@ -5864,7 +5869,6 @@ get_utility_query_def(Query *query, deparse_context *context)
 	}
 }
 
-
 /*
  * Display a Var appropriately.
  *
@@ -5917,82 +5921,10 @@ get_variable(Var *var, int levelsup, bool istoplevel, deparse_context *context)
 		colinfo = deparse_columns_fetch(var->varno, dpns);
 		attnum = var->varattno;
 	}
-	else if (var->varno == OUTER_VAR && dpns->outer_tlist)
-	{
-		TargetEntry *tle;
-		deparse_namespace save_dpns;
-
-		tle = get_tle_by_resno(dpns->outer_tlist, var->varattno);
-		if (!tle)
-			elog(ERROR, "bogus varattno for OUTER_VAR var: %d", var->varattno);
-
-		Assert(netlevelsup == 0);
-		push_child_plan(dpns, dpns->outer_planstate, &save_dpns);
-
-		/*
-		 * Force parentheses because our caller probably assumed a Var is a
-		 * simple expression.
-		 */
-		if (!IsA(tle->expr, Var))
-			appendStringInfoChar(buf, '(');
-		get_rule_expr((Node *) tle->expr, context, true);
-		if (!IsA(tle->expr, Var))
-			appendStringInfoChar(buf, ')');
-
-		pop_child_plan(dpns, &save_dpns);
-		return NULL;
-	}
-	else if (var->varno == INNER_VAR && dpns->inner_tlist)
-	{
-		TargetEntry *tle;
-		deparse_namespace save_dpns;
-
-		tle = get_tle_by_resno(dpns->inner_tlist, var->varattno);
-		if (!tle)
-			elog(ERROR, "bogus varattno for INNER_VAR var: %d", var->varattno);
-
-		Assert(netlevelsup == 0);
-		push_child_plan(dpns, dpns->inner_planstate, &save_dpns);
-
-		/*
-		 * Force parentheses because our caller probably assumed a Var is a
-		 * simple expression.
-		 */
-		if (!IsA(tle->expr, Var))
-			appendStringInfoChar(buf, '(');
-		get_rule_expr((Node *) tle->expr, context, true);
-		if (!IsA(tle->expr, Var))
-			appendStringInfoChar(buf, ')');
-
-		pop_child_plan(dpns, &save_dpns);
-		return NULL;
-	}
-	else if (var->varno == INDEX_VAR && dpns->index_tlist)
-	{
-		TargetEntry *tle;
-
-		tle = get_tle_by_resno(dpns->index_tlist, var->varattno);
-		if (!tle)
-			elog(ERROR, "bogus varattno for INDEX_VAR var: %d", var->varattno);
-
-		Assert(netlevelsup == 0);
-
-		/*
-		 * Force parentheses because our caller probably assumed a Var is a
-		 * simple expression.
-		 */
-		if (!IsA(tle->expr, Var))
-			appendStringInfoChar(buf, '(');
-		get_rule_expr((Node *) tle->expr, context, true);
-		if (!IsA(tle->expr, Var))
-			appendStringInfoChar(buf, ')');
-
-		return NULL;
-	}
 	else
 	{
-		elog(ERROR, "bogus varno: %d", var->varno);
-		return NULL;			/* keep compiler quiet */
+		resolve_special_varno((Node *) var, context, get_special_variable);
+		return NULL;
 	}
 
 	/*
@@ -6105,6 +6037,101 @@ get_variable(Var *var, int levelsup, bool istoplevel, deparse_context *context)
 	return attname;
 }
 
+/*
+ * Deparse a Var which references OUTER_VAR, INNER_VAR, or INDEX_VAR.  This
+ * routine is actually a callback for get_special_varno, which handles finding
+ * the correct TargetEntry.  We get the expression contained in that
+ * TargetEntry and just need to deparse it, a job we can throw back on
+ * get_rule_expr.
+ */
+static void
+get_special_variable(Node *node, deparse_context *context)
+{
+	StringInfo	buf = context->buf;
+
+	/*
+	 * Force parentheses because our caller probably assumed a Var is a
+	 * simple expression.
+	 */
+	if (!IsA(node, Var))
+		appendStringInfoChar(buf, '(');
+	get_rule_expr(node, context, true);
+	if (!IsA(node, Var))
+		appendStringInfoChar(buf, ')');
+}
+
+/*
+ * Chase through plan references to special varnos (OUTER_VAR, INNER_VAR,
+ * INDEX_VAR) until we find a real Var or some kind of non-Var node; then,
+ * invoke the callback provided.
+ */
+static void
+resolve_special_varno(Node *node, deparse_context *context,
+					  void (*callback)(Node *, deparse_context *))
+{
+	Var		   *var;
+	deparse_namespace *dpns;
+
+	/* If it's not a Var, invoke the callback. */
+	if (!IsA(node, Var))
+	{
+		callback(node, context);
+		return;
+	}
+
+	/* Find appropriate nesting depth */
+	var = (Var *) node;
+	dpns = (deparse_namespace *) list_nth(context->namespaces,
+										  var->varlevelsup);
+
+	/*
+	 * It's a special RTE, so recurse.
+	 */
+	if (var->varno == OUTER_VAR && dpns->outer_tlist)
+	{
+		TargetEntry *tle;
+		deparse_namespace save_dpns;
+
+		tle = get_tle_by_resno(dpns->outer_tlist, var->varattno);
+		if (!tle)
+			elog(ERROR, "bogus varattno for OUTER_VAR var: %d", var->varattno);
+
+		push_child_plan(dpns, dpns->outer_planstate, &save_dpns);
+		resolve_special_varno((Node *) tle->expr, context, callback);
+		pop_child_plan(dpns, &save_dpns);
+		return;
+	}
+	else if (var->varno == INNER_VAR && dpns->inner_tlist)
+	{
+		TargetEntry *tle;
+		deparse_namespace save_dpns;
+
+		tle = get_tle_by_resno(dpns->inner_tlist, var->varattno);
+		if (!tle)
+			elog(ERROR, "bogus varattno for INNER_VAR var: %d", var->varattno);
+
+		push_child_plan(dpns, dpns->inner_planstate, &save_dpns);
+		resolve_special_varno((Node *) tle->expr, context, callback);
+		pop_child_plan(dpns, &save_dpns);
+		return;
+	}
+	else if (var->varno == INDEX_VAR && dpns->index_tlist)
+	{
+		TargetEntry *tle;
+
+		tle = get_tle_by_resno(dpns->index_tlist, var->varattno);
+		if (!tle)
+			elog(ERROR, "bogus varattno for INDEX_VAR var: %d", var->varattno);
+
+		resolve_special_varno((Node *) tle->expr, context, callback);
+		return;
+	}
+	else if (var->varno < 1 || var->varno > list_length(dpns->rtable))
+		elog(ERROR, "bogus varno: %d", var->varno);
+
+	/* Not special.  Just invoke the callback. */
+	callback(node, context);
+}
 
 /*
  * Get the name of a field of an expression of composite type.  The
@@ -7067,7 +7094,7 @@ get_rule_expr(Node *node, deparse_context *context,
 			break;
 
 		case T_Aggref:
-			get_agg_expr((Aggref *) node, context);
+			get_agg_expr((Aggref *) node, context, true);
 			break;
 
 		case T_GroupingFunc:
@@ -8223,13 +8250,36 @@ get_func_expr(FuncExpr *expr, deparse_context *context,
  * get_agg_expr			- Parse back an Aggref node
  */
 static void
-get_agg_expr(Aggref *aggref, deparse_context *context)
+get_agg_expr(Aggref *aggref, deparse_context *context,
+			 bool allow_partial_decoration)
 {
 	StringInfo	buf = context->buf;
 	Oid			argtypes[FUNC_MAX_ARGS];
 	int			nargs;
 	bool		use_variadic;
 
+	/*
+	 * For a combining aggregate, we look up and deparse the corresponding
+	 * partial aggregate instead.  This is necessary because our input
+	 * argument list has been replaced; the new argument list always has
+	 * just one element, which will point to a partial Aggref that supplies
+	 * us with transition states to combine.
+	 */
+	if (aggref->aggcombine)
+	{
+		TargetEntry *tle = linitial(aggref->args);
+
+		Assert(list_length(aggref->args) == 1);
+		Assert(IsA(tle, TargetEntry));
+		resolve_special_varno((Node *) tle->expr, context,
+							  get_agg_combine_expr);
+		return;
+	}
+
+	/* Decorate partial aggregates, unless suppressed. */
+	if (aggref->aggpartial && allow_partial_decoration)
+		appendStringInfoString(buf, "PARTIAL ");
+
 	/* Extract the argument types as seen by the parser */
 	nargs = get_aggregate_argtypes(aggref, argtypes);
 
@@ -8299,6 +8349,25 @@ get_agg_expr(Aggref *aggref, deparse_context *context)
 }
 
 /*
+ * This is a helper function for get_agg_expr().  It's used when we deparse
+ * a combining Aggref; resolve_special_varno locates the corresponding partial
+ * Aggref and then calls this.
+ */
+static void
+get_agg_combine_expr(Node *node, deparse_context *context)
+{
+	Aggref	   *aggref;
+
+	if (!IsA(node, Aggref))
+		elog(ERROR, "combining Aggref does not point to an Aggref");
+	aggref = (Aggref *) node;
+	if (!aggref->aggpartial)
+		elog(ERROR, "referenced Aggref is not partial");
+
+	get_agg_expr(aggref, context, false);
+}
+
+/*
  * get_windowfunc_expr	- Parse back a WindowFunc node
  */
 static void
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 1ffc0a1..a4bc751 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -280,6 +280,8 @@ typedef struct Aggref
 	bool		aggstar;		/* TRUE if argument list was really '*' */
 	bool		aggvariadic;	/* true if variadic arguments have been
 								 * combined into an array last argument */
+	bool		aggcombine;		/* combining agg; input is a transvalue */
+	bool		aggpartial;		/* partial agg; output is a transvalue */
 	char		aggkind;		/* aggregate kind (see pg_aggregate.h) */
 	Index		agglevelsup;	/* > 0 if agg belongs to outer query */
 	int			location;		/* token location, or -1 if unknown */

#19

Tom Lane

tgl@sss.pgh.pa.us

over 9 years ago

In reply to: Robert Haas (#18)

Re: EXPLAIN VERBOSE with parallel Aggregate

Robert Haas <robertmhaas@gmail.com> writes:

My proposed fix for this issue is attached. Review is welcome;
otherwise, I'll just commit this. The output looks like what I
suggested upthread:

I haven't read the patch, but the sample output looks sane.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#20

David Rowley

david.rowley@2ndquadrant.com

over 9 years ago

In reply to: Robert Haas (#18)

1 attachment(s)

Re: EXPLAIN VERBOSE with parallel Aggregate

On 27 April 2016 at 08:46, Robert Haas <robertmhaas@gmail.com> wrote:

My proposed fix for this issue is attached. Review is welcome;
otherwise, I'll just commit this. The output looks like what I
suggested upthread:

+ if (!aggref->aggpartial)
+ elog(ERROR, "referenced Aggref is not partial");

I think this is overly restrictive; ruleutils seems a bit out of line
here to say that plans can't have > 1 combine aggregate node.

When coding the combine aggs stuff I had test code to inject
additional combine paths to make sure everything worked as expected.
It did, but it won't if you add these two lines. I'd say just remove
them.

If you apply the attached and execute;

create table t1 (num int not null);
insert into t1 select generate_series(1,2000000);

explain verbose select avg(num) from t1;

You'll get;

ERROR: referenced Aggref is not partial

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

extra_combine_node.patchapplication/octet-stream; name=extra_combine_node.patchDownload

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 6770836..1690230 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -3646,10 +3646,24 @@ create_grouping_paths(PlannerInfo *root,
 												 -1.0);
 
 			if (parse->hasAggs)
+			{
+				AggPath *aggpath = create_agg_path(root,
+											grouped_rel,
+											path,
+											target,
+								parse->groupClause ? AGG_SORTED : AGG_PLAIN,
+											parse->groupClause,
+											(List *) parse->havingQual,
+											&agg_final_costs,
+											dNumGroups,
+											true,
+											false,
+											true);
+
 				add_path(grouped_rel, (Path *)
 							create_agg_path(root,
 											grouped_rel,
-											path,
+											(Path *) aggpath,
 											target,
 								parse->groupClause ? AGG_SORTED : AGG_PLAIN,
 											parse->groupClause,
@@ -3659,6 +3673,8 @@ create_grouping_paths(PlannerInfo *root,
 											true,
 											true,
 											true));
+
+			}
 			else
 				add_path(grouped_rel, (Path *)
 							create_group_path(root,

#21

Robert Haas

robertmhaas@gmail.com

over 9 years ago

In reply to: David Rowley (#20)

Re: EXPLAIN VERBOSE with parallel Aggregate

On Tue, Apr 26, 2016 at 6:44 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

On 27 April 2016 at 08:46, Robert Haas <robertmhaas@gmail.com> wrote:

My proposed fix for this issue is attached. Review is welcome;
otherwise, I'll just commit this. The output looks like what I
suggested upthread:
+ if (!aggref->aggpartial)
+ elog(ERROR, "referenced Aggref is not partial");
I think this is overly restrictive; ruleutils seems a bit out of line
here to say that plans can't have > 1 combine aggregate node.

When coding the combine aggs stuff I had test code to inject
additional combine paths to make sure everything worked as expected.
It did, but it won't if you add these two lines. I'd say just remove
them.

If you apply the attached and execute;

create table t1 (num int not null);
insert into t1 select generate_series(1,2000000);

explain verbose select avg(num) from t1;

You'll get;

ERROR: referenced Aggref is not partial

In this test patch, should aggpath be using partial_grouping_target
rather than target? I feel like we need to use
partial_grouping_target unless finalizeAggs is true.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#22

David Rowley

david.rowley@2ndquadrant.com

over 9 years ago

In reply to: Robert Haas (#21)

Re: EXPLAIN VERBOSE with parallel Aggregate

On 27 April 2016 at 12:37, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Apr 26, 2016 at 6:44 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
On 27 April 2016 at 08:46, Robert Haas <robertmhaas@gmail.com> wrote:

My proposed fix for this issue is attached. Review is welcome;
otherwise, I'll just commit this. The output looks like what I
suggested upthread:
+ if (!aggref->aggpartial)
+ elog(ERROR, "referenced Aggref is not partial");
I think this is overly restrictive; ruleutils seems a bit out of line
here to say that plans can't have > 1 combine aggregate node.

When coding the combine aggs stuff I had test code to inject
additional combine paths to make sure everything worked as expected.
It did, but it won't if you add these two lines. I'd say just remove
them.

If you apply the attached and execute;

create table t1 (num int not null);
insert into t1 select generate_series(1,2000000);

explain verbose select avg(num) from t1;

You'll get;

ERROR: referenced Aggref is not partial
In this test patch, should aggpath be using partial_grouping_target
rather than target? I feel like we need to use
partial_grouping_target unless finalizeAggs is true.

Yes. I should also have removed the HAVING clause from the path.

After having changed that, I do still get the error.

I changed it to a NOTICE for now;

# explain verbose select avg(num) from t1 having sum(num) > 0;
NOTICE: referenced Aggref is not partial
NOTICE: referenced Aggref is not partial
QUERY PLAN
---------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=22350.24..22350.25 rows=1 width=32)
Output: avg(num)
Filter: (sum(t1.num) > 0)
-> Partial Aggregate (cost=22350.22..22350.23 rows=1 width=40)
Output: avg(num), sum(num)
-> Gather (cost=22350.00..22350.21 rows=2 width=40)
Output: (PARTIAL avg(num)), (PARTIAL sum(num))
Workers Planned: 2
-> Partial Aggregate (cost=21350.00..21350.01 rows=1 width=40)
Output: PARTIAL avg(num), PARTIAL sum(num)
-> Parallel Seq Scan on public.t1
(cost=0.00..17183.33 rows=833333 width=4)
Output: num
(12 rows)

You can see that the middle aggregate node properly includes the
sum(num) from the having clause, so is using the partial target list.

I'd also have expected the output of both partial nodes to be the
same, i.e. both prefixed with PARTIAL. Is it intended that they don't?
or have I made some other mistake?

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#23

Robert Haas

robertmhaas@gmail.com

over 9 years ago

In reply to: David Rowley (#22)

Re: EXPLAIN VERBOSE with parallel Aggregate

On Tue, Apr 26, 2016 at 9:14 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

I'd also have expected the output of both partial nodes to be the
same, i.e. both prefixed with PARTIAL. Is it intended that they don't?
or have I made some other mistake?

No, that's a defect in the patch. I didn't consider that we need to
support nodes with finalizeAggs = false and combineStates = true,
which is why that ERROR was there. Working on a fix now.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#24

Robert Haas

robertmhaas@gmail.com

over 9 years ago

In reply to: Robert Haas (#23)

1 attachment(s)

Re: EXPLAIN VERBOSE with parallel Aggregate

On Tue, Apr 26, 2016 at 9:56 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Apr 26, 2016 at 9:14 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

I'd also have expected the output of both partial nodes to be the
same, i.e. both prefixed with PARTIAL. Is it intended that they don't?
or have I made some other mistake?

No, that's a defect in the patch. I didn't consider that we need to
support nodes with finalizeAggs = false and combineStates = true,
which is why that ERROR was there. Working on a fix now.

I think this version should work, provided you use
partial_grouping_target where needed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

parallel-aggregate-explain-v2.patchapplication/x-patch; name=parallel-aggregate-explain-v2.patchDownload

diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index a21928b..20e38f0 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -1244,6 +1244,8 @@ _copyAggref(const Aggref *from)
 	COPY_NODE_FIELD(aggfilter);
 	COPY_SCALAR_FIELD(aggstar);
 	COPY_SCALAR_FIELD(aggvariadic);
+	COPY_SCALAR_FIELD(aggpartial);
+	COPY_SCALAR_FIELD(aggcombine);
 	COPY_SCALAR_FIELD(aggkind);
 	COPY_SCALAR_FIELD(agglevelsup);
 	COPY_LOCATION_FIELD(location);
diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 3c6c567..c5ccc42 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -202,6 +202,8 @@ _equalAggref(const Aggref *a, const Aggref *b)
 	COMPARE_NODE_FIELD(aggfilter);
 	COMPARE_SCALAR_FIELD(aggstar);
 	COMPARE_SCALAR_FIELD(aggvariadic);
+	COMPARE_SCALAR_FIELD(aggcombine);
+	COMPARE_SCALAR_FIELD(aggpartial);
 	COMPARE_SCALAR_FIELD(aggkind);
 	COMPARE_SCALAR_FIELD(agglevelsup);
 	COMPARE_LOCATION_FIELD(location);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index 5ac7446..c2f0e0f 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -1040,6 +1040,8 @@ _outAggref(StringInfo str, const Aggref *node)
 	WRITE_NODE_FIELD(aggfilter);
 	WRITE_BOOL_FIELD(aggstar);
 	WRITE_BOOL_FIELD(aggvariadic);
+	WRITE_BOOL_FIELD(aggcombine);
+	WRITE_BOOL_FIELD(aggpartial);
 	WRITE_CHAR_FIELD(aggkind);
 	WRITE_UINT_FIELD(agglevelsup);
 	WRITE_LOCATION_FIELD(location);
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index 8059594..6f28047 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -556,6 +556,8 @@ _readAggref(void)
 	READ_NODE_FIELD(aggfilter);
 	READ_BOOL_FIELD(aggstar);
 	READ_BOOL_FIELD(aggvariadic);
+	READ_BOOL_FIELD(aggcombine);
+	READ_BOOL_FIELD(aggpartial);
 	READ_CHAR_FIELD(aggkind);
 	READ_UINT_FIELD(agglevelsup);
 	READ_LOCATION_FIELD(location);
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 5537c14..266e830 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -2100,6 +2100,10 @@ search_indexed_tlist_for_partial_aggref(Aggref *aggref, indexed_tlist *itlist,
 				continue;
 			if (aggref->aggvariadic != tlistaggref->aggvariadic)
 				continue;
+			/*
+			 * it would be harmless to compare aggcombine and aggpartial, but
+			 * it's also unnecessary
+			 */
 			if (aggref->aggkind != tlistaggref->aggkind)
 				continue;
 			if (aggref->agglevelsup != tlistaggref->agglevelsup)
@@ -2463,6 +2467,7 @@ fix_combine_agg_expr_mutator(Node *node, fix_upper_expr_context *context)
 			newtle = makeTargetEntry((Expr *) newvar, 1, NULL, false);
 			newaggref = (Aggref *) copyObject(aggref);
 			newaggref->args = list_make1(newtle);
+			newaggref->aggcombine = true;
 
 			return (Node *) newaggref;
 		}
diff --git a/src/backend/optimizer/util/tlist.c b/src/backend/optimizer/util/tlist.c
index 4c8c83d..aa2c2f8 100644
--- a/src/backend/optimizer/util/tlist.c
+++ b/src/backend/optimizer/util/tlist.c
@@ -792,6 +792,9 @@ apply_partialaggref_adjustment(PathTarget *target)
 			else
 				newaggref->aggoutputtype = aggform->aggtranstype;
 
+			/* flag it as partial */
+			newaggref->aggpartial = true;
+
 			lfirst(lc) = newaggref;
 
 			ReleaseSysCache(aggTuple);
diff --git a/src/backend/utils/adt/ruleutils.c b/src/backend/utils/adt/ruleutils.c
index 1b8f0ae..75b26dd 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -392,6 +392,11 @@ static void get_rule_windowspec(WindowClause *wc, List *targetList,
 					deparse_context *context);
 static char *get_variable(Var *var, int levelsup, bool istoplevel,
 			 deparse_context *context);
+static void get_special_variable(Node *node, deparse_context *context,
+					 void *private);
+static void resolve_special_varno(Node *node, deparse_context *context,
+					  void *private,
+					  void (*callback) (Node *, deparse_context *, void *));
 static Node *find_param_referent(Param *param, deparse_context *context,
 					deparse_namespace **dpns_p, ListCell **ancestor_cell_p);
 static void get_parameter(Param *param, deparse_context *context);
@@ -407,7 +412,10 @@ static void get_rule_expr_toplevel(Node *node, deparse_context *context,
 static void get_oper_expr(OpExpr *expr, deparse_context *context);
 static void get_func_expr(FuncExpr *expr, deparse_context *context,
 			  bool showimplicit);
-static void get_agg_expr(Aggref *aggref, deparse_context *context);
+static void get_agg_expr(Aggref *aggref, deparse_context *context,
+			 Aggref *original_aggref);
+static void get_agg_combine_expr(Node *node, deparse_context *context,
+					 void *private);
 static void get_windowfunc_expr(WindowFunc *wfunc, deparse_context *context);
 static void get_coercion_expr(Node *arg, deparse_context *context,
 				  Oid resulttype, int32 resulttypmod,
@@ -5864,7 +5872,6 @@ get_utility_query_def(Query *query, deparse_context *context)
 	}
 }
 
-
 /*
  * Display a Var appropriately.
  *
@@ -5917,82 +5924,11 @@ get_variable(Var *var, int levelsup, bool istoplevel, deparse_context *context)
 		colinfo = deparse_columns_fetch(var->varno, dpns);
 		attnum = var->varattno;
 	}
-	else if (var->varno == OUTER_VAR && dpns->outer_tlist)
-	{
-		TargetEntry *tle;
-		deparse_namespace save_dpns;
-
-		tle = get_tle_by_resno(dpns->outer_tlist, var->varattno);
-		if (!tle)
-			elog(ERROR, "bogus varattno for OUTER_VAR var: %d", var->varattno);
-
-		Assert(netlevelsup == 0);
-		push_child_plan(dpns, dpns->outer_planstate, &save_dpns);
-
-		/*
-		 * Force parentheses because our caller probably assumed a Var is a
-		 * simple expression.
-		 */
-		if (!IsA(tle->expr, Var))
-			appendStringInfoChar(buf, '(');
-		get_rule_expr((Node *) tle->expr, context, true);
-		if (!IsA(tle->expr, Var))
-			appendStringInfoChar(buf, ')');
-
-		pop_child_plan(dpns, &save_dpns);
-		return NULL;
-	}
-	else if (var->varno == INNER_VAR && dpns->inner_tlist)
-	{
-		TargetEntry *tle;
-		deparse_namespace save_dpns;
-
-		tle = get_tle_by_resno(dpns->inner_tlist, var->varattno);
-		if (!tle)
-			elog(ERROR, "bogus varattno for INNER_VAR var: %d", var->varattno);
-
-		Assert(netlevelsup == 0);
-		push_child_plan(dpns, dpns->inner_planstate, &save_dpns);
-
-		/*
-		 * Force parentheses because our caller probably assumed a Var is a
-		 * simple expression.
-		 */
-		if (!IsA(tle->expr, Var))
-			appendStringInfoChar(buf, '(');
-		get_rule_expr((Node *) tle->expr, context, true);
-		if (!IsA(tle->expr, Var))
-			appendStringInfoChar(buf, ')');
-
-		pop_child_plan(dpns, &save_dpns);
-		return NULL;
-	}
-	else if (var->varno == INDEX_VAR && dpns->index_tlist)
-	{
-		TargetEntry *tle;
-
-		tle = get_tle_by_resno(dpns->index_tlist, var->varattno);
-		if (!tle)
-			elog(ERROR, "bogus varattno for INDEX_VAR var: %d", var->varattno);
-
-		Assert(netlevelsup == 0);
-
-		/*
-		 * Force parentheses because our caller probably assumed a Var is a
-		 * simple expression.
-		 */
-		if (!IsA(tle->expr, Var))
-			appendStringInfoChar(buf, '(');
-		get_rule_expr((Node *) tle->expr, context, true);
-		if (!IsA(tle->expr, Var))
-			appendStringInfoChar(buf, ')');
-
-		return NULL;
-	}
 	else
 	{
-		elog(ERROR, "bogus varno: %d", var->varno);
-		return NULL;			/* keep compiler quiet */
+		resolve_special_varno((Node *) var, context, NULL,
+							  get_special_variable);
+		return NULL;
 	}
 
 	/*
@@ -6105,6 +6041,101 @@ get_variable(Var *var, int levelsup, bool istoplevel, deparse_context *context)
 	return attname;
 }
 
+/*
+ * Deparse a Var which references OUTER_VAR, INNER_VAR, or INDEX_VAR.  This
+ * routine is actually a callback for get_special_varno, which handles finding
+ * the correct TargetEntry.  We get the expression contained in that
+ * TargetEntry and just need to deparse it, a job we can throw back on
+ * get_rule_expr.
+ */
+static void
+get_special_variable(Node *node, deparse_context *context, void *private)
+{
+	StringInfo	buf = context->buf;
+
+	/*
+	 * Force parentheses because our caller probably assumed a Var is a simple
+	 * expression.
+	 */
+	if (!IsA(node, Var))
+		appendStringInfoChar(buf, '(');
+	get_rule_expr(node, context, true);
+	if (!IsA(node, Var))
+		appendStringInfoChar(buf, ')');
+}
+
+/*
+ * Chase through plan references to special varnos (OUTER_VAR, INNER_VAR,
+ * INDEX_VAR) until we find a real Var or some kind of non-Var node; then,
+ * invoke the callback provided.
+ */
+static void
+resolve_special_varno(Node *node, deparse_context *context, void *private,
+					  void (*callback) (Node *, deparse_context *, void *))
+{
+	Var		   *var;
+	deparse_namespace *dpns;
+
+	/* If it's not a Var, invoke the callback. */
+	if (!IsA(node, Var))
+	{
+		callback(node, context, private);
+		return;
+	}
+
+	/* Find appropriate nesting depth */
+	var = (Var *) node;
+	dpns = (deparse_namespace *) list_nth(context->namespaces,
+										  var->varlevelsup);
+
+	/*
+	 * It's a special RTE, so recurse.
+	 */
+	if (var->varno == OUTER_VAR && dpns->outer_tlist)
+	{
+		TargetEntry *tle;
+		deparse_namespace save_dpns;
+
+		tle = get_tle_by_resno(dpns->outer_tlist, var->varattno);
+		if (!tle)
+			elog(ERROR, "bogus varattno for OUTER_VAR var: %d", var->varattno);
+
+		push_child_plan(dpns, dpns->outer_planstate, &save_dpns);
+		resolve_special_varno((Node *) tle->expr, context, private, callback);
+		pop_child_plan(dpns, &save_dpns);
+		return;
+	}
+	else if (var->varno == INNER_VAR && dpns->inner_tlist)
+	{
+		TargetEntry *tle;
+		deparse_namespace save_dpns;
+
+		tle = get_tle_by_resno(dpns->inner_tlist, var->varattno);
+		if (!tle)
+			elog(ERROR, "bogus varattno for INNER_VAR var: %d", var->varattno);
+
+		push_child_plan(dpns, dpns->inner_planstate, &save_dpns);
+		resolve_special_varno((Node *) tle->expr, context, private, callback);
+		pop_child_plan(dpns, &save_dpns);
+		return;
+	}
+	else if (var->varno == INDEX_VAR && dpns->index_tlist)
+	{
+		TargetEntry *tle;
+
+		tle = get_tle_by_resno(dpns->index_tlist, var->varattno);
+		if (!tle)
+			elog(ERROR, "bogus varattno for INDEX_VAR var: %d", var->varattno);
+
+		resolve_special_varno((Node *) tle->expr, context, private, callback);
+		return;
+	}
+	else if (var->varno < 1 || var->varno > list_length(dpns->rtable))
+		elog(ERROR, "bogus varno: %d", var->varno);
+
+	/* Not special.  Just invoke the callback. */
+	callback(node, context, private);
+}
 
 /*
  * Get the name of a field of an expression of composite type.  The
@@ -7067,7 +7098,7 @@ get_rule_expr(Node *node, deparse_context *context,
 			break;
 
 		case T_Aggref:
-			get_agg_expr((Aggref *) node, context);
+			get_agg_expr((Aggref *) node, context, (Aggref *) node);
 			break;
 
 		case T_GroupingFunc:
@@ -8223,13 +8254,36 @@ get_func_expr(FuncExpr *expr, deparse_context *context,
  * get_agg_expr			- Parse back an Aggref node
  */
 static void
-get_agg_expr(Aggref *aggref, deparse_context *context)
+get_agg_expr(Aggref *aggref, deparse_context *context,
+			 Aggref *original_aggref)
 {
 	StringInfo	buf = context->buf;
 	Oid			argtypes[FUNC_MAX_ARGS];
 	int			nargs;
 	bool		use_variadic;
 
+	/*
+	 * For a combining aggregate, we look up and deparse the corresponding
+	 * partial aggregate instead.  This is necessary because our input
+	 * argument list has been replaced; the new argument list always has just
+	 * one element, which will point to a partial Aggref that supplies us with
+	 * transition states to combine.
+	 */
+	if (aggref->aggcombine)
+	{
+		TargetEntry *tle = linitial(aggref->args);
+
+		Assert(list_length(aggref->args) == 1);
+		Assert(IsA(tle, TargetEntry));
+		resolve_special_varno((Node *) tle->expr, context, original_aggref,
+							  get_agg_combine_expr);
+		return;
+	}
+
+	/* Mark as PARTIAL, if appropriate. */
+	if (original_aggref->aggpartial)
+		appendStringInfoString(buf, "PARTIAL ");
+
 	/* Extract the argument types as seen by the parser */
 	nargs = get_aggregate_argtypes(aggref, argtypes);
 
@@ -8299,6 +8353,24 @@ get_agg_expr(Aggref *aggref, deparse_context *context)
 }
 
 /*
+ * This is a helper function for get_agg_expr().  It's used when we deparse
+ * a combining Aggref; resolve_special_varno locates the corresponding partial
+ * Aggref and then calls this.
+ */
+static void
+get_agg_combine_expr(Node *node, deparse_context *context, void *private)
+{
+	Aggref	   *aggref;
+	Aggref	   *original_aggref = private;
+
+	if (!IsA(node, Aggref))
+		elog(ERROR, "combining Aggref does not point to an Aggref");
+
+	aggref = (Aggref *) node;
+	get_agg_expr(aggref, context, original_aggref);
+}
+
+/*
  * get_windowfunc_expr	- Parse back a WindowFunc node
  */
 static void
diff --git a/src/include/nodes/primnodes.h b/src/include/nodes/primnodes.h
index 1ffc0a1..a4bc751 100644
--- a/src/include/nodes/primnodes.h
+++ b/src/include/nodes/primnodes.h
@@ -280,6 +280,8 @@ typedef struct Aggref
 	bool		aggstar;		/* TRUE if argument list was really '*' */
 	bool		aggvariadic;	/* true if variadic arguments have been
 								 * combined into an array last argument */
+	bool		aggcombine;		/* combining agg; input is a transvalue */
+	bool		aggpartial;		/* partial agg; output is a transvalue */
 	char		aggkind;		/* aggregate kind (see pg_aggregate.h) */
 	Index		agglevelsup;	/* > 0 if agg belongs to outer query */
 	int			location;		/* token location, or -1 if unknown */

#25

David Rowley

david.rowley@2ndquadrant.com

over 9 years ago

In reply to: Robert Haas (#24)

Re: EXPLAIN VERBOSE with parallel Aggregate

On 27 April 2016 at 14:30, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Apr 26, 2016 at 9:56 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Apr 26, 2016 at 9:14 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

I'd also have expected the output of both partial nodes to be the
same, i.e. both prefixed with PARTIAL. Is it intended that they don't?
or have I made some other mistake?

No, that's a defect in the patch. I didn't consider that we need to
support nodes with finalizeAggs = false and combineStates = true,
which is why that ERROR was there. Working on a fix now.

I think this version should work, provided you use
partial_grouping_target where needed.

+static void get_special_variable(Node *node, deparse_context *context,
+ void *private);

"private" is reserved in C++? I understood we want our C code to
compile as C++ too, right? or did I get my wires crossed somewhere?

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#26

Andres Freund

andres@anarazel.de

over 9 years ago

In reply to: David Rowley (#25)

Re: EXPLAIN VERBOSE with parallel Aggregate

On 2016-04-27 14:57:24 +1200, David Rowley wrote:

"private" is reserved in C++? I understood we want our C code to
compile as C++ too, right? or did I get my wires crossed somewhere?

Just headers. Our C code unfortunately is very far away from being C++
compliant.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#27

Robert Haas

robertmhaas@gmail.com

over 9 years ago

In reply to: David Rowley (#25)

Re: EXPLAIN VERBOSE with parallel Aggregate

On Tue, Apr 26, 2016 at 10:57 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

On 27 April 2016 at 14:30, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Apr 26, 2016 at 9:56 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Apr 26, 2016 at 9:14 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

I'd also have expected the output of both partial nodes to be the
same, i.e. both prefixed with PARTIAL. Is it intended that they don't?
or have I made some other mistake?

No, that's a defect in the patch. I didn't consider that we need to
support nodes with finalizeAggs = false and combineStates = true,
which is why that ERROR was there. Working on a fix now.

I think this version should work, provided you use
partial_grouping_target where needed.
+static void get_special_variable(Node *node, deparse_context *context,
+ void *private);
"private" is reserved in C++? I understood we want our C code to
compile as C++ too, right? or did I get my wires crossed somewhere?

I can call it something other than "private", if you have a
suggestion; normally I would have used "context", but that's already
taken in this case. private_context would work, I guess.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#28

David Rowley

david.rowley@2ndquadrant.com

over 9 years ago

In reply to: Robert Haas (#27)

1 attachment(s)

Re: EXPLAIN VERBOSE with parallel Aggregate

On 27 April 2016 at 15:12, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Apr 26, 2016 at 10:57 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
On 27 April 2016 at 14:30, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Apr 26, 2016 at 9:56 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Apr 26, 2016 at 9:14 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

I'd also have expected the output of both partial nodes to be the
same, i.e. both prefixed with PARTIAL. Is it intended that they don't?
or have I made some other mistake?

No, that's a defect in the patch. I didn't consider that we need to
support nodes with finalizeAggs = false and combineStates = true,
which is why that ERROR was there. Working on a fix now.

I think this version should work, provided you use
partial_grouping_target where needed.
+static void get_special_variable(Node *node, deparse_context *context,
+ void *private);
"private" is reserved in C++? I understood we want our C code to
compile as C++ too, right? or did I get my wires crossed somewhere?
I can call it something other than "private", if you have a
suggestion; normally I would have used "context", but that's already
taken in this case. private_context would work, I guess.

It's fine. After Andres' email I looked and saw many other usages of
C++ keywords in our C code. Perhaps it would be a good idea to name it
something else we wanted to work towards it, but it sounds like it's
not, so probably keep what you've got.

The patch looks good. The only other thing I thought about was perhaps
it would be a good idea to re-add the sanity checks in execQual.c.
Patch for that is attached.

I removed the aggoutputtype check, as I only bothered adding that in
the first place because I lost the aggpartial field in some previous
revision of the parallel aggregate developments. I'd say the
aggpartial check makes it surplus to requirements.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachments:

execQual_sanity_checks.patchapplication/octet-stream; name=execQual_sanity_checks.patchDownload

diff --git a/src/backend/executor/execQual.c b/src/backend/executor/execQual.c
index 4df4a9b..c0ca58b 100644
--- a/src/backend/executor/execQual.c
+++ b/src/backend/executor/execQual.c
@@ -4510,28 +4510,35 @@ ExecInitExpr(Expr *node, PlanState *parent)
 		case T_Aggref:
 			{
 				AggrefExprState *astate = makeNode(AggrefExprState);
+				AggState   *aggstate = (AggState *) parent;
+				Aggref	   *aggref = (Aggref *) node;
 
 				astate->xprstate.evalfunc = (ExprStateEvalFunc) ExecEvalAggref;
-				if (parent && IsA(parent, AggState))
+				if (!aggstate || !IsA(aggstate, AggState))
 				{
-					AggState   *aggstate = (AggState *) parent;
-					Aggref	   *aggref = (Aggref *) node;
-
-					if (aggstate->finalizeAggs &&
-						aggref->aggoutputtype != aggref->aggtype)
-					{
-						/* planner messed up */
-						elog(ERROR, "Aggref aggoutputtype must match aggtype");
-					}
-
-					aggstate->aggs = lcons(astate, aggstate->aggs);
-					aggstate->numaggs++;
+					/* planner messed up */
+					elog(ERROR, "Aggref found in non-Agg plan node");
 				}
-				else
+				if (aggref->aggpartial == aggstate->finalizeAggs)
 				{
 					/* planner messed up */
-					elog(ERROR, "Aggref found in non-Agg plan node");
+					if (aggref->aggpartial)
+						elog(ERROR, "partial Aggref found in finalize agg plan node");
+					else
+						elog(ERROR, "non-partial Aggref found in non-finalize agg plan node");
 				}
+
+				if (aggref->aggcombine != aggstate->combineStates)
+				{
+					/* planner messed up */
+					if (aggref->aggcombine)
+						elog(ERROR, "combine Aggref found in non-combine agg plan node");
+					else
+						elog(ERROR, "non-combine Aggref found in combine agg plan node");
+				}
+
+				aggstate->aggs = lcons(astate, aggstate->aggs);
+				aggstate->numaggs++;
 				state = (ExprState *) astate;
 			}
 			break;

#29

Robert Haas

robertmhaas@gmail.com

over 9 years ago

In reply to: David Rowley (#28)

Re: EXPLAIN VERBOSE with parallel Aggregate

On Tue, Apr 26, 2016 at 11:38 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

On 27 April 2016 at 15:12, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Apr 26, 2016 at 10:57 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:
On 27 April 2016 at 14:30, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Apr 26, 2016 at 9:56 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Apr 26, 2016 at 9:14 PM, David Rowley
<david.rowley@2ndquadrant.com> wrote:

I'd also have expected the output of both partial nodes to be the
same, i.e. both prefixed with PARTIAL. Is it intended that they don't?
or have I made some other mistake?

No, that's a defect in the patch. I didn't consider that we need to
support nodes with finalizeAggs = false and combineStates = true,
which is why that ERROR was there. Working on a fix now.

I think this version should work, provided you use
partial_grouping_target where needed.
+static void get_special_variable(Node *node, deparse_context *context,
+ void *private);
"private" is reserved in C++? I understood we want our C code to
compile as C++ too, right? or did I get my wires crossed somewhere?
I can call it something other than "private", if you have a
suggestion; normally I would have used "context", but that's already
taken in this case. private_context would work, I guess.
It's fine. After Andres' email I looked and saw many other usages of
C++ keywords in our C code. Perhaps it would be a good idea to name it
something else we wanted to work towards it, but it sounds like it's
not, so probably keep what you've got.

The patch looks good. The only other thing I thought about was perhaps
it would be a good idea to re-add the sanity checks in execQual.c.
Patch for that is attached.

I removed the aggoutputtype check, as I only bothered adding that in
the first place because I lost the aggpartial field in some previous
revision of the parallel aggregate developments. I'd say the
aggpartial check makes it surplus to requirements.

OK, committed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#30

David Rowley

david.rowley@2ndquadrant.com

over 9 years ago

In reply to: Robert Haas (#29)

Re: EXPLAIN VERBOSE with parallel Aggregate

On 28 April 2016 at 04:08, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, Apr 26, 2016 at 11:38 PM, David Rowley

The patch looks good. The only other thing I thought about was perhaps
it would be a good idea to re-add the sanity checks in execQual.c.
Patch for that is attached.

I removed the aggoutputtype check, as I only bothered adding that in
the first place because I lost the aggpartial field in some previous
revision of the parallel aggregate developments. I'd say the
aggpartial check makes it surplus to requirements.

OK, committed.

Great. Thank you.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers