Deduplicate aggregates and transition functions in planner

Started by Heikki Linnakangasover 5 years ago6 messageshackers

heikki.linnakangas@enterprisedb.com

over 5 years ago

Hi,

Currently, ExecInitAgg() performs quite a lot of work, to deduplicate
identical Aggrefs, as well as Aggrefs that can share the same transition
state. That doesn't really belong in the executor, we should perform
that work in the planner. It doesn't change from one invocation of the
plan to another, and it would be nice to reflect the state-sharing in
the plan costs.

Attached is a patch to do that. It adds two new fields to Aggref,
'aggno' and 'aggtransno', to identify the unique aggregate and
transition states. The duplicates are detected, and those filled in,
early in the planning. Aside from those fields, the planner doesn't pass
any other new information to to the executor, so the the executor still
has to do syscache lookups to get the transition, combine etc. functions.

I tried a bigger refactoring at first, to pass more information from the
planner to the executor, but the patch grew really large before I got
very far with it. So as the first step, I think we should apply the
attached patch, and further refactoring can be done after that, if it
seems worthwhile.

There is one known regression failure, in the 'partition_aggregate'
test, which is caused by a plan change in one of the queries. The query
contains a few aggregates, and the planner now detects that some of them
are identical, which changed the cost estimates, making a different plan
look cheaper. That's easy to fix, although I'm not sure yet if we should
accept the new plan, or change the query to still get the old plan.

- Heikki

Andres Freund

andres@anarazel.de

over 5 years ago

In reply to: Heikki Linnakangas (#1)

Re: Deduplicate aggregates and transition functions in planner

Hi,

On 2020-10-28 21:10:41 +0200, Heikki Linnakangas wrote:

Currently, ExecInitAgg() performs quite a lot of work, to deduplicate
identical Aggrefs, as well as Aggrefs that can share the same transition
state. That doesn't really belong in the executor, we should perform that
work in the planner. It doesn't change from one invocation of the plan to
another, and it would be nice to reflect the state-sharing in the plan
costs.

Woo! Very glad to see this tackled.

It wouldn't surprise me to see a small execution time speedup here -
I've seen the load of the aggno show up in profiles.

Attached is a patch to do that. It adds two new fields to Aggref, 'aggno'
and 'aggtransno', to identify the unique aggregate and transition states.
The duplicates are detected, and those filled in, early in the planning.
Aside from those fields, the planner doesn't pass any other new information
to to the executor, so the the executor still has to do syscache lookups to
get the transition, combine etc. functions.

I tried a bigger refactoring at first, to pass more information from the
planner to the executor, but the patch grew really large before I got very
far with it. So as the first step, I think we should apply the attached
patch, and further refactoring can be done after that, if it seems
worthwhile.

Working incrementally makes sense.

@@ -783,14 +783,13 @@ ExecInitExprRec(Expr *node, ExprState *state,
scratch.opcode = EEOP_AGGREF;
scratch.d.aggref.astate = astate;
-				astate->aggref = aggref;
+				astate->aggno = aggref->aggno;
if (state->parent && IsA(state->parent, AggState))
{
AggState *aggstate = (AggState *) state->parent;
-					aggstate->aggs = lappend(aggstate->aggs, astate);
-					aggstate->numaggs++;
+					aggstate->aggs = lappend(aggstate->aggs, aggref);

Hm. Why is aggstate->aggs still built during expression initialization?
Imo that's a pretty huge wart that also introduces more
order-of-operation brittleness to executor startup.

Since AggRef now knows its aggno, the state for EEOP_AGGREF should be
changed to be just an int, instead of a pointer to AggrefExprState..

@@ -3432,8 +3426,18 @@ ExecInitAgg(Agg *node, EState *estate, int eflags)
/*
* We should now have found all Aggrefs in the targetlist and quals.
*/
-	numaggs = aggstate->numaggs;
-	Assert(numaggs == list_length(aggstate->aggs));
+	numaggrefs = list_length(aggstate->aggs);
+	max_aggno = -1;
+	max_transno = -1;
+	foreach(l, aggstate->aggs)
+	{
+		Aggref	   *aggref = (Aggref *) lfirst(l);
+
+		max_aggno = Max(max_aggno, aggref->aggno);
+		max_transno = Max(max_transno, aggref->aggtransno);
+	}
+	numaggs = max_aggno + 1;
+	numtrans = max_transno + 1;

We must have previously determined this, why don't we stash it in struct
Agg?

--- a/src/backend/jit/llvm/llvmjit_expr.c
+++ b/src/backend/jit/llvm/llvmjit_expr.c
@@ -1850,19 +1850,11 @@ llvm_compile_expr(ExprState *state)
case EEOP_AGGREF:
{
AggrefExprState *aggref = op->d.aggref.astate;
-					LLVMValueRef v_aggnop;
LLVMValueRef v_aggno;
LLVMValueRef value,
isnull;

-					/*
-					 * At this point aggref->aggno is not yet set (it's set up
-					 * in ExecInitAgg() after initializing the expression). So
-					 * load it from memory each time round.
-					 */
-					v_aggnop = l_ptr_const(&aggref->aggno,
-										   l_ptr(LLVMInt32Type()));
-					v_aggno = LLVMBuildLoad(b, v_aggnop, "v_aggno");
+					v_aggno = l_int32_const(aggref->aggno);

Yay!

+/*
+ * get_agg_clause_costs
+ *	  Recursively find the Aggref nodes in an expression tree, and
+ *	  accumulate cost information about them.

Think this comment is out of date now.

Greetings,

Andres Freund

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

over 5 years ago

In reply to: Andres Freund (#2)

Re: Deduplicate aggregates and transition functions in planner

On 28/10/2020 21:59, Andres Freund wrote:

On 2020-10-28 21:10:41 +0200, Heikki Linnakangas wrote:

Currently, ExecInitAgg() performs quite a lot of work, to deduplicate
identical Aggrefs, as well as Aggrefs that can share the same transition
state. That doesn't really belong in the executor, we should perform that
work in the planner. It doesn't change from one invocation of the plan to
another, and it would be nice to reflect the state-sharing in the plan
costs.

Woo! Very glad to see this tackled.

It wouldn't surprise me to see a small execution time speedup here -
I've seen the load of the aggno show up in profiles.

I think you'd be hard-pressed to find a real-life query where it
matters. But if you don't care about real life:

regression=# do $$
begin
for i in 1..100000 loop
perform sum(g), sum(g+0), sum(g+1), sum(g+2), sum(g+3), sum(g+4),
sum(g+5), sum(g+6), sum(g+7), sum(g+8), sum(g+9), sum(g+10) from
generate_series(1,1) g;
end loop;
end;
$$;
DO
Time: 1282.701 ms (00:01.283)

vs.

Time: 860.323 ms

with the patch.

@@ -783,14 +783,13 @@ ExecInitExprRec(Expr *node, ExprState *state,
scratch.opcode = EEOP_AGGREF;
scratch.d.aggref.astate = astate;
-				astate->aggref = aggref;
+				astate->aggno = aggref->aggno;
if (state->parent && IsA(state->parent, AggState))
{
AggState *aggstate = (AggState *) state->parent;
-					aggstate->aggs = lappend(aggstate->aggs, astate);
-					aggstate->numaggs++;
+					aggstate->aggs = lappend(aggstate->aggs, aggref);
Hm. Why is aggstate->aggs still built during expression initialization?
Imo that's a pretty huge wart that also introduces more
order-of-operation brittleness to executor startup.

The Agg node itself doesn't include any information about the aggregates
and transition functions. Because of that, ExecInitAgg needs a
"representive" Aggref for each transition state and agg, to initialize
the per-trans and per-agg structs. The expression initialization makes
those Aggrefs available for ExecInitAgg.

Instead of collecting all the Aggrefs in a list, ExecInitExprRec() could
set each representative Aggref directly in the right per-trans and
per-agg struct, based on the 'aggno' and 'aggtransno' fields. That
requires initializing the per-trans and per-agg arrays earlier, and for
that, we would need to store the # of aggs and transition states in the
Agg node, like you also suggested. Certainly doable, but on the whole,
it didn't really seem better to me. Attached is a patch, to demonstrate
what that looks like, on top of the main patch. It's not complete,
there's at least one case with hash-DISTINCT for queries like "SELECT
DISTINCT aggregate(x) ..." where the planner creates an Agg for the
DISTINCT without aggregates, but the code currently passes numAggs=1 to
the executor. Some further changes would be needed in the planner, to
mark the AggPath generated for deduplication differently from the
AggPaths created for aggregation. Again that's doable, but on the whole
I prefer the approach to scan the Aggrefs in executor startup, for now.

I'd like to get rid of the "representative Aggrefs" altogether, and pass
information about the transition and final functions from planner to
executor in some other form. But that's exactly what got me into the
refactoring that was ballooning out of hand that I mentioned.

- Heikki

Andres Freund

andres@anarazel.de

over 5 years ago

In reply to: Heikki Linnakangas (#3)

Re: Deduplicate aggregates and transition functions in planner

Hi,

On 2020-10-29 10:17:20 +0200, Heikki Linnakangas wrote:

On 28/10/2020 21:59, Andres Freund wrote:

It wouldn't surprise me to see a small execution time speedup here -
I've seen the load of the aggno show up in profiles.

I think you'd be hard-pressed to find a real-life query where it
matters. But if you don't care about real life:

I was actually thinking about a different angle - that the evaluation of
an Aggref can be faster, because we need less indirection to find the
aggno. As you have already implemented for the JITed code, but removing
it for the expression code looks easy enough too. You'd need a lot of
groups and presumably a fair number of Aggrefs to see it.

Attached is a quick version of what I am thinking wrt AggrefExprState.

@@ -783,14 +783,13 @@ ExecInitExprRec(Expr *node, ExprState *state,
scratch.opcode = EEOP_AGGREF;
scratch.d.aggref.astate = astate;
-				astate->aggref = aggref;
+				astate->aggno = aggref->aggno;
if (state->parent && IsA(state->parent, AggState))
{
AggState   *aggstate = (AggState *) state->parent;
-					aggstate->aggs = lappend(aggstate->aggs, astate);
-					aggstate->numaggs++;
+					aggstate->aggs = lappend(aggstate->aggs, aggref);
Hm. Why is aggstate->aggs still built during expression initialization?
Imo that's a pretty huge wart that also introduces more
order-of-operation brittleness to executor startup.
The Agg node itself doesn't include any information about the aggregates and
transition functions. Because of that, ExecInitAgg needs a "representive"
Aggref for each transition state and agg, to initialize the per-trans and
per-agg structs. The expression initialization makes those Aggrefs available
for ExecInitAgg.

Instead of collecting all the Aggrefs in a list, ExecInitExprRec() could set
each representative Aggref directly in the right per-trans and per-agg
struct, based on the 'aggno' and 'aggtransno' fields.

Hold on a second: To me the question is why is it the right design that
the Agg node doesn't have the information about "aggregates and
transition functions"? Agg e.g. already does directly contains the group
keys...

My concern wouldn't really be addressed if we replace the lappend() in
ExecInitExprRec() with setting something "directly in the right
per-trans...". I think it's structurally wrong to have to discover
Aggrefs at execution time at all.

Perhaps the easiest incremental step would be to have something like a
CookedAggref { int aggno; } and then just store the Aggref nodes in
Agg->aggs, with aggno referencing that...

I'd like to get rid of the "representative Aggrefs" altogether, and pass
information about the transition and final functions from planner to
executor in some other form. But that's exactly what got me into the
refactoring that was ballooning out of hand that I mentioned.

Fair.

Greetings,

Andres Freund

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

over 5 years ago

In reply to: Andres Freund (#4)

Re: Deduplicate aggregates and transition functions in planner

On 29/10/2020 19:48, Andres Freund wrote:

On 2020-10-29 10:17:20 +0200, Heikki Linnakangas wrote:

On 28/10/2020 21:59, Andres Freund wrote:

It wouldn't surprise me to see a small execution time speedup here -
I've seen the load of the aggno show up in profiles.

I think you'd be hard-pressed to find a real-life query where it
matters. But if you don't care about real life:

I was actually thinking about a different angle - that the evaluation of
an Aggref can be faster, because we need less indirection to find the
aggno. As you have already implemented for the JITed code, but removing
it for the expression code looks easy enough too. You'd need a lot of
groups and presumably a fair number of Aggrefs to see it.

Attached is a quick version of what I am thinking wrt AggrefExprState.

Ah, I see, makes sense.

The Agg node itself doesn't include any information about the aggregates and
transition functions. Because of that, ExecInitAgg needs a "representive"
Aggref for each transition state and agg, to initialize the per-trans and
per-agg structs. The expression initialization makes those Aggrefs available
for ExecInitAgg.

Instead of collecting all the Aggrefs in a list, ExecInitExprRec() could set
each representative Aggref directly in the right per-trans and per-agg
struct, based on the 'aggno' and 'aggtransno' fields.

Hold on a second: To me the question is why is it the right design that
the Agg node doesn't have the information about "aggregates and
transition functions"? Agg e.g. already does directly contains the group
keys...

My concern wouldn't really be addressed if we replace the lappend() in
ExecInitExprRec() with setting something "directly in the right
per-trans...". I think it's structurally wrong to have to discover
Aggrefs at execution time at all.

Perhaps the easiest incremental step would be to have something like a
CookedAggref { int aggno; } and then just store the Aggref nodes in
Agg->aggs, with aggno referencing that...

I started hacking on that CookedAggref approach, but it wasn't as simple
as it seemed. I tried to replace the Aggrefs with CookedAggrefs in
setrefs.c, but when set_plan_references() replaces expressions with Vars
referring to the output of a subnode, it needs to be able to match an
Aggref at an upper node to a CookedAggref on the node below.
Furthermore, the deparsing code in ruleutils.c needs to be able to find
the original Aggrefs, in order to print them nicely. All of that is
solvable, I'm sure, but it's not trivial. And it's new code that mostly
builds on top of attached patch, so I think that can be done separately
later, it doesn't need to block this patch.

So barring objections, I'm going to push the attached updated patch that
includes the removal of AggrefExprState, and leave CookedAggrefs or
other further refactorings for the future.

- Heikki

Heikki Linnakangas

heikki.linnakangas@enterprisedb.com

over 5 years ago

In reply to: Heikki Linnakangas (#5)

Re: Deduplicate aggregates and transition functions in planner

On 19/11/2020 12:38, Heikki Linnakangas wrote:

So barring objections, I'm going to push the attached updated patch that
includes the removal of AggrefExprState, and leave CookedAggrefs or
other further refactorings for the future.

Done. Thanks!

- Heikki

Deduplicate aggregates and transition functions in planner

Attachments:

Attachments:

Attachments:

Attachments: