Special-case executor expression steps for common combinations
The attached patch adds special-case expression steps for common sets of steps
in the executor to shave a few cycles off during execution, and make the JIT
generated code simpler.
* Adds EEOP_FUNCEXPR_STRICT_1 and EEOP_FUNCEXPR_STRICT_2 for function calls of
strict functions with 1 or 2 arguments (EEOP_FUNCEXPR_STRICT remains used for
2 arguments).
* Adds EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1 which is a special case for the
common case of one arg aggs.
* Replace EEOP_DONE with EEOP_DONE_RETURN and EEOP_DONE_NO_RETURN to be able to
skip extra setup for steps which are only interested in the side effects.
Stressing the EEOP_FUNCEXPR_STRICT_* steps specifically shows a 1.5%
improvement and pgbench over the branch shows a ~1% improvement in TPS (both
measured over 6 runs with outliers removed).
EEOP_FUNCEXPR_STRICT_* (10M iterations):
master : (7503.317, 7553.691, 7634.524)
patched : (7422.756, 7455.120, 7492.393)
pgbench:
master : (3653.83, 3792.97, 3863.70)
patched : (3743.04, 3830.02, 3869.80)
This patch was extracted from a larger body of work from Andres [0]/messages/by-id/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de aiming at
providing the necessary executor infrastructure for making JIT expression
caching possible. This patch, and more which are to be submitted, is however
separate in the sense that it is not part of the infrastructure, it's an
improvements on its own.
Thoughts?
--
Daniel Gustafsson
[0]: /messages/by-id/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de
Attachments:
v1-0001-Add-fast-path-expression-steps-for-common-combina.patchapplication/octet-stream; name=v1-0001-Add-fast-path-expression-steps-for-common-combina.patch; x-unix-mode=0644Download
From 0935adcda7f9bc62174204f3decb9283b93cfd5a Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Thu, 12 Oct 2023 11:33:27 +0200
Subject: [PATCH v1] Add fast-path expression steps for common combinations
This adds special-case expression steps for common sets of steps
in the executor in order to optimize evaluation.
* EEOP_FUNCEXPR_STRICT_1 and EEOP_FUNCEXPR_STRICT_2 for function
calls of strict functions with one or two arguments (the older
EEOP_FUNCEXPR_STRICT op remains used for > two arguments).
* Adds EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1 which is a special case
for the common case of one arg aggs.
* Replace EEOP_DONE with EEOP_DONE_RETURN and EEOP_DONE_NO_RETURN
to be able to skip extra setup for steps which are only using
the side effects.
This patch was extracted from a larger body of work providing the
necessary executor infrastructure for JIT expression caching.
Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de
Discussion: https://postgr.es/m/TBD
---
src/backend/executor/README | 11 ++-
src/backend/executor/execExpr.c | 33 ++++++---
src/backend/executor/execExprInterp.c | 99 ++++++++++++++++++++++++---
src/backend/executor/nodeAgg.c | 7 +-
src/backend/jit/llvm/llvmjit_expr.c | 13 +++-
src/include/executor/execExpr.h | 14 ++--
src/include/executor/executor.h | 52 +++++++++++++-
7 files changed, 192 insertions(+), 37 deletions(-)
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 642d63be61..54f4782f31 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -133,9 +133,14 @@ is used by the function evaluation step, thus avoiding extra work to copy
the result values around.
The last entry in a completed ExprState->steps array is always an
-EEOP_DONE step; this removes the need to test for end-of-array while
-iterating. Also, if the expression contains any variable references (to
-user columns of the ExprContext's INNER, OUTER, or SCAN tuples), the steps
+EEOP_DONE_RETURN or EEOP_DONE_NO_RETURN step; this removes the need to
+test for end-of-array while iterating. The former is used when the
+expression returns a value directly, the latter when side-effects of
+expression initialization are the goal (e.g. for projection or
+aggregate transition value computation).
+
+Also, if the expression contains any variable references (to user
+columns of the ExprContext's INNER, OUTER, or SCAN tuples), the steps
array begins with EEOP_*_FETCHSOME steps that ensure that the relevant
tuples have been deconstructed to make the required columns directly
available (cf. slot_getsomeattrs()). This allows individual Var-fetching
diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index 2c62b0c9c8..84a9ecec02 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -8,7 +8,7 @@
* using ExecInitExpr() et al. This converts the tree into a flat array
* of ExprEvalSteps, which may be thought of as instructions in a program.
* At runtime, we'll execute steps, starting with the first, until we reach
- * an EEOP_DONE opcode.
+ * an EEOP_DONE_{RETURN|NO_RETURN} opcode.
*
* This file contains the "compilation" logic. It is independent of the
* specific execution technology we use (switch statement, computed goto,
@@ -147,7 +147,7 @@ ExecInitExpr(Expr *node, PlanState *parent)
ExecInitExprRec(node, state, &state->resvalue, &state->resnull);
/* Finally, append a DONE step */
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -184,7 +184,7 @@ ExecInitExprWithParams(Expr *node, ParamListInfo ext_params)
ExecInitExprRec(node, state, &state->resvalue, &state->resnull);
/* Finally, append a DONE step */
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -279,7 +279,7 @@ ExecInitQual(List *qual, PlanState *parent)
* have yielded TRUE, and since its result is stored in the desired output
* location, we're done.
*/
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -474,7 +474,7 @@ ExecBuildProjectionInfo(List *targetList,
}
}
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_NO_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -713,7 +713,7 @@ ExecBuildUpdateProjection(List *targetList,
}
}
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_NO_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -1644,7 +1644,7 @@ ExecInitExprRec(Expr *node, ExprState *state,
else
{
/* Not trivial, so append a DONE step */
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(elemstate, &scratch);
/* and ready the subexpression */
ExecReadyExpr(elemstate);
@@ -2663,7 +2663,15 @@ ExecInitFunc(ExprEvalStep *scratch, Expr *node, List *args, Oid funcid,
if (pgstat_track_functions <= flinfo->fn_stats)
{
if (flinfo->fn_strict && nargs > 0)
- scratch->opcode = EEOP_FUNCEXPR_STRICT;
+ {
+ /* Choose nargs optimized implementation if available. */
+ if (nargs == 1)
+ scratch->opcode = EEOP_FUNCEXPR_STRICT_1;
+ else if (nargs == 2)
+ scratch->opcode = EEOP_FUNCEXPR_STRICT_2;
+ else
+ scratch->opcode = EEOP_FUNCEXPR_STRICT;
+ }
else
scratch->opcode = EEOP_FUNCEXPR;
}
@@ -3674,6 +3682,8 @@ ExecBuildAggTrans(AggState *aggstate, AggStatePerPhase phase,
{
if (strictnulls)
scratch.opcode = EEOP_AGG_STRICT_INPUT_CHECK_NULLS;
+ else if (strictargs && pertrans->numTransInputs == 1)
+ scratch.opcode = EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1;
else
scratch.opcode = EEOP_AGG_STRICT_INPUT_CHECK_ARGS;
scratch.d.agg_strict_input_check.nulls = strictnulls;
@@ -3750,6 +3760,7 @@ ExecBuildAggTrans(AggState *aggstate, AggStatePerPhase phase,
as->d.jump.jumpdone = state->steps_len;
}
else if (as->opcode == EEOP_AGG_STRICT_INPUT_CHECK_ARGS ||
+ as->opcode == EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1 ||
as->opcode == EEOP_AGG_STRICT_INPUT_CHECK_NULLS)
{
Assert(as->d.agg_strict_input_check.jumpnull == -1);
@@ -3773,7 +3784,7 @@ ExecBuildAggTrans(AggState *aggstate, AggStatePerPhase phase,
scratch.resvalue = NULL;
scratch.resnull = NULL;
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_NO_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -4037,7 +4048,7 @@ ExecBuildGroupingEqual(TupleDesc ldesc, TupleDesc rdesc,
scratch.resvalue = NULL;
scratch.resnull = NULL;
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -4171,7 +4182,7 @@ ExecBuildParamSetEqual(TupleDesc desc,
scratch.resvalue = NULL;
scratch.resnull = NULL;
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index 24c2b60c62..265c201897 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -239,7 +239,8 @@ ExecReadyInterpretedExpr(ExprState *state)
/* Simple validity checks on expression */
Assert(state->steps_len >= 1);
- Assert(state->steps[state->steps_len - 1].opcode == EEOP_DONE);
+ Assert(state->steps[state->steps_len - 1].opcode == EEOP_DONE_RETURN ||
+ state->steps[state->steps_len - 1].opcode == EEOP_DONE_NO_RETURN);
/*
* Don't perform redundant initialization. This is unreachable in current
@@ -314,7 +315,9 @@ ExecReadyInterpretedExpr(ExprState *state)
return;
}
else if (step0 == EEOP_CASE_TESTVAL &&
- step1 == EEOP_FUNCEXPR_STRICT &&
+ (step1 == EEOP_FUNCEXPR_STRICT ||
+ step1 == EEOP_FUNCEXPR_STRICT_1 ||
+ step1 == EEOP_FUNCEXPR_STRICT_2) &&
state->steps[0].d.casetest.value)
{
state->evalfunc_private = (void *) ExecJustApplyFuncToCase;
@@ -405,7 +408,8 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
*/
#if defined(EEO_USE_COMPUTED_GOTO)
static const void *const dispatch_table[] = {
- &&CASE_EEOP_DONE,
+ &&CASE_EEOP_DONE_RETURN,
+ &&CASE_EEOP_DONE_NO_RETURN,
&&CASE_EEOP_INNER_FETCHSOME,
&&CASE_EEOP_OUTER_FETCHSOME,
&&CASE_EEOP_SCAN_FETCHSOME,
@@ -424,6 +428,8 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
&&CASE_EEOP_CONST,
&&CASE_EEOP_FUNCEXPR,
&&CASE_EEOP_FUNCEXPR_STRICT,
+ &&CASE_EEOP_FUNCEXPR_STRICT_1,
+ &&CASE_EEOP_FUNCEXPR_STRICT_2,
&&CASE_EEOP_FUNCEXPR_FUSAGE,
&&CASE_EEOP_FUNCEXPR_STRICT_FUSAGE,
&&CASE_EEOP_BOOL_AND_STEP_FIRST,
@@ -487,6 +493,7 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
&&CASE_EEOP_AGG_STRICT_DESERIALIZE,
&&CASE_EEOP_AGG_DESERIALIZE,
&&CASE_EEOP_AGG_STRICT_INPUT_CHECK_ARGS,
+ &&CASE_EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1,
&&CASE_EEOP_AGG_STRICT_INPUT_CHECK_NULLS,
&&CASE_EEOP_AGG_PLAIN_PERGROUP_NULLCHECK,
&&CASE_EEOP_AGG_PLAIN_TRANS_INIT_STRICT_BYVAL,
@@ -524,9 +531,16 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
EEO_SWITCH()
{
- EEO_CASE(EEOP_DONE)
+ EEO_CASE(EEOP_DONE_RETURN)
{
- goto out;
+ *isnull = state->resnull;
+ return state->resvalue;
+ }
+
+ EEO_CASE(EEOP_DONE_NO_RETURN)
+ {
+ Assert(isnull == NULL);
+ return 0;
}
EEO_CASE(EEOP_INNER_FETCHSOME)
@@ -738,6 +752,7 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
EEO_NEXT();
}
+ /* strict function call with more than two arguments */
EEO_CASE(EEOP_FUNCEXPR_STRICT)
{
FunctionCallInfo fcinfo = op->d.func.fcinfo_data;
@@ -745,6 +760,8 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
int nargs = op->d.func.nargs;
Datum d;
+ Assert(nargs > 2);
+
/* strict function, so check for NULL args */
for (int argno = 0; argno < nargs; argno++)
{
@@ -763,6 +780,54 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
EEO_NEXT();
}
+ /* strict function call with one argument */
+ EEO_CASE(EEOP_FUNCEXPR_STRICT_1)
+ {
+ FunctionCallInfo fcinfo = op->d.func.fcinfo_data;
+ NullableDatum *args = fcinfo->args;
+
+ Assert(op->d.func.nargs == 1);
+
+ /* strict function, so check for NULL args */
+ if (args[0].isnull)
+ *op->resnull = true;
+ else
+ {
+ Datum d;
+
+ fcinfo->isnull = false;
+ d = op->d.func.fn_addr(fcinfo);
+ *op->resvalue = d;
+ *op->resnull = fcinfo->isnull;
+ }
+
+ EEO_NEXT();
+ }
+
+ /* strict function call with two arguments */
+ EEO_CASE(EEOP_FUNCEXPR_STRICT_2)
+ {
+ FunctionCallInfo fcinfo = op->d.func.fcinfo_data;
+ NullableDatum *args = fcinfo->args;
+
+ Assert(op->d.func.nargs == 2);
+
+ /* strict function, so check for NULL args */
+ if (args[0].isnull || args[1].isnull)
+ *op->resnull = true;
+ else
+ {
+ Datum d;
+
+ fcinfo->isnull = false;
+ d = op->d.func.fn_addr(fcinfo);
+ *op->resvalue = d;
+ *op->resnull = fcinfo->isnull;
+ }
+
+ EEO_NEXT();
+ }
+
EEO_CASE(EEOP_FUNCEXPR_FUSAGE)
{
/* not common enough to inline */
@@ -1625,11 +1690,14 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
* input is not NULL.
*/
+ /* when checking more than one argument */
EEO_CASE(EEOP_AGG_STRICT_INPUT_CHECK_ARGS)
{
NullableDatum *args = op->d.agg_strict_input_check.args;
int nargs = op->d.agg_strict_input_check.nargs;
+ Assert(nargs > 1);
+
for (int argno = 0; argno < nargs; argno++)
{
if (args[argno].isnull)
@@ -1638,6 +1706,19 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
EEO_NEXT();
}
+ /* special case for just one argument */
+ EEO_CASE(EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1)
+ {
+ NullableDatum *args = op->d.agg_strict_input_check.args;
+ PG_USED_FOR_ASSERTS_ONLY int nargs = op->d.agg_strict_input_check.nargs;
+
+ Assert(nargs == 1);
+
+ if (args[0].isnull)
+ EEO_JUMP(op->d.agg_strict_input_check.jumpnull);
+ EEO_NEXT();
+ }
+
EEO_CASE(EEOP_AGG_STRICT_INPUT_CHECK_NULLS)
{
bool *nulls = op->d.agg_strict_input_check.nulls;
@@ -1840,13 +1921,13 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
{
/* unreachable */
Assert(false);
- goto out;
+ goto out_error;
}
}
-out:
- *isnull = state->resnull;
- return state->resvalue;
+out_error:
+ pg_unreachable();
+ return (Datum) 0;
}
/*
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index f154f28902..85f2bb970c 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -816,11 +816,8 @@ advance_transition_function(AggState *aggstate,
static void
advance_aggregates(AggState *aggstate)
{
- bool dummynull;
-
- ExecEvalExprSwitchContext(aggstate->phase->evaltrans,
- aggstate->tmpcontext,
- &dummynull);
+ ExecEvalExprNoReturnSwitchContext(aggstate->phase->evaltrans,
+ aggstate->tmpcontext);
}
/*
diff --git a/src/backend/jit/llvm/llvmjit_expr.c b/src/backend/jit/llvm/llvmjit_expr.c
index 4b51aa1ce0..fbf257ddf5 100644
--- a/src/backend/jit/llvm/llvmjit_expr.c
+++ b/src/backend/jit/llvm/llvmjit_expr.c
@@ -251,7 +251,7 @@ llvm_compile_expr(ExprState *state)
switch (opcode)
{
- case EEOP_DONE:
+ case EEOP_DONE_RETURN:
{
LLVMValueRef v_tmpisnull;
LLVMValueRef v_tmpvalue;
@@ -265,6 +265,10 @@ llvm_compile_expr(ExprState *state)
break;
}
+ case EEOP_DONE_NO_RETURN:
+ LLVMBuildRet(b, l_sizet_const(0));
+ break;
+
case EEOP_INNER_FETCHSOME:
case EEOP_OUTER_FETCHSOME:
case EEOP_SCAN_FETCHSOME:
@@ -542,12 +546,16 @@ llvm_compile_expr(ExprState *state)
case EEOP_FUNCEXPR:
case EEOP_FUNCEXPR_STRICT:
+ case EEOP_FUNCEXPR_STRICT_1:
+ case EEOP_FUNCEXPR_STRICT_2:
{
FunctionCallInfo fcinfo = op->d.func.fcinfo_data;
LLVMValueRef v_fcinfo_isnull;
LLVMValueRef v_retval;
- if (opcode == EEOP_FUNCEXPR_STRICT)
+ if (opcode == EEOP_FUNCEXPR_STRICT ||
+ opcode == EEOP_FUNCEXPR_STRICT_1 ||
+ opcode == EEOP_FUNCEXPR_STRICT_2)
{
LLVMBasicBlockRef b_nonull;
LLVMBasicBlockRef *b_checkargnulls;
@@ -1984,6 +1992,7 @@ llvm_compile_expr(ExprState *state)
}
case EEOP_AGG_STRICT_INPUT_CHECK_ARGS:
+ case EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1:
case EEOP_AGG_STRICT_INPUT_CHECK_NULLS:
{
int nargs = op->d.agg_strict_input_check.nargs;
diff --git a/src/include/executor/execExpr.h b/src/include/executor/execExpr.h
index 048573c2bc..4e6609dd26 100644
--- a/src/include/executor/execExpr.h
+++ b/src/include/executor/execExpr.h
@@ -64,8 +64,11 @@ typedef struct ExprEvalRowtypeCache
*/
typedef enum ExprEvalOp
{
- /* entire expression has been evaluated completely, return */
- EEOP_DONE,
+ /* entire expression has been evaluated, return value */
+ EEOP_DONE_RETURN,
+
+ /* entire expression has been evaluated, no return value */
+ EEOP_DONE_NO_RETURN,
/* apply slot_getsomeattrs on corresponding tuple slot */
EEOP_INNER_FETCHSOME,
@@ -104,11 +107,13 @@ typedef enum ExprEvalOp
/*
* Evaluate function call (including OpExprs etc). For speed, we
- * distinguish in the opcode whether the function is strict and/or
- * requires usage stats tracking.
+ * distinguish in the opcode whether the function is strict with 1, 2, or
+ * more arguments and/or requires usage stats tracking.
*/
EEOP_FUNCEXPR,
EEOP_FUNCEXPR_STRICT,
+ EEOP_FUNCEXPR_STRICT_1,
+ EEOP_FUNCEXPR_STRICT_2,
EEOP_FUNCEXPR_FUSAGE,
EEOP_FUNCEXPR_STRICT_FUSAGE,
@@ -247,6 +252,7 @@ typedef enum ExprEvalOp
EEOP_AGG_STRICT_DESERIALIZE,
EEOP_AGG_DESERIALIZE,
EEOP_AGG_STRICT_INPUT_CHECK_ARGS,
+ EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1,
EEOP_AGG_STRICT_INPUT_CHECK_NULLS,
EEOP_AGG_PLAIN_PERGROUP_NULLCHECK,
EEOP_AGG_PLAIN_TRANS_INIT_STRICT_BYVAL,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index e1eefb400b..cc321bbf9e 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -337,6 +337,34 @@ ExecEvalExpr(ExprState *state,
}
#endif
+/*
+ * ExecEvalExprNoReturn
+ *
+ * Like ExecEvalExpr(), but for cases where no return value is expected,
+ * because the side-effects of expression evaluation are what's desired. This
+ * is e.g. used for projection and aggregate transition computation.
+
+ * Evaluate expression identified by "state" in the execution context
+ * given by "econtext".
+ *
+ * The caller should already have switched into the temporary memory context
+ * econtext->ecxt_per_tuple_memory. The convenience entry point
+ * ExecEvalExprNoReturnSwitchContext() is provided for callers who don't
+ * prefer to do the switch in an outer loop.
+ */
+#ifndef FRONTEND
+static inline void
+ExecEvalExprNoReturn(ExprState *state,
+ ExprContext *econtext)
+{
+ PG_USED_FOR_ASSERTS_ONLY Datum retDatum;
+
+ retDatum = state->evalfunc(state, econtext, NULL);
+
+ Assert(retDatum == (Datum) 0);
+}
+#endif
+
/*
* ExecEvalExprSwitchContext
*
@@ -358,6 +386,25 @@ ExecEvalExprSwitchContext(ExprState *state,
}
#endif
+/*
+ * ExecEvalExprNoReturnSwitchContext
+ *
+ * Same as ExecEvalExprNoReturn, but get into the right allocation context
+ * explicitly.
+ */
+#ifndef FRONTEND
+static inline void
+ExecEvalExprNoReturnSwitchContext(ExprState *state,
+ ExprContext *econtext)
+{
+ MemoryContext oldContext;
+
+ oldContext = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
+ ExecEvalExprNoReturn(state, econtext);
+ MemoryContextSwitchTo(oldContext);
+}
+#endif
+
/*
* ExecProject
*
@@ -377,7 +424,6 @@ ExecProject(ProjectionInfo *projInfo)
ExprContext *econtext = projInfo->pi_exprContext;
ExprState *state = &projInfo->pi_state;
TupleTableSlot *slot = state->resultslot;
- bool isnull;
/*
* Clear any former contents of the result slot. This makes it safe for
@@ -385,8 +431,8 @@ ExecProject(ProjectionInfo *projInfo)
*/
ExecClearTuple(slot);
- /* Run the expression, discarding scalar result from the last column. */
- (void) ExecEvalExprSwitchContext(state, econtext, &isnull);
+ /* Run the expression */
+ ExecEvalExprNoReturnSwitchContext(state, econtext);
/*
* Successfully formed a result row. Mark the result slot as containing a
--
2.32.1 (Apple Git-133)
On 12/10/2023 12:48, Daniel Gustafsson wrote:
The attached patch adds special-case expression steps for common sets of steps
in the executor to shave a few cycles off during execution, and make the JIT
generated code simpler.* Adds EEOP_FUNCEXPR_STRICT_1 and EEOP_FUNCEXPR_STRICT_2 for function calls of
strict functions with 1 or 2 arguments (EEOP_FUNCEXPR_STRICT remains used for2 arguments).
* Adds EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1 which is a special case for the
common case of one arg aggs.
Are these relevant when JITting? I'm a little sad if the JIT compiler
cannot unroll these on its own. Is there something we could do to hint
it, so that it could treat the number of arguments as a constant?
I understand that this can give a small boost in interpreter mode, so
maybe we should do it in any case. But I'd like to know if we're missing
a trick with the JITter, before we mask it with this.
* Replace EEOP_DONE with EEOP_DONE_RETURN and EEOP_DONE_NO_RETURN to be able to
skip extra setup for steps which are only interested in the side effects.
I'm a little surprised if this makes a measurable performance
difference, but sure, why not. It seems nice to be more explicit when
you don't expect a return value.
--
Heikki Linnakangas
Neon (https://neon.tech)
On Thu, 12 Oct 2023 at 22:54, Daniel Gustafsson <daniel@yesql.se> wrote:
EEOP_FUNCEXPR_STRICT_* (10M iterations):
master : (7503.317, 7553.691, 7634.524)
patched : (7422.756, 7455.120, 7492.393)pgbench:
master : (3653.83, 3792.97, 3863.70)
patched : (3743.04, 3830.02, 3869.80)Thoughts?
Did any of these tests compile the expression with JIT?
If not, how does the performance compare for a query that JITs the expression?
David
Hi,
On 2023-10-12 13:24:27 +0300, Heikki Linnakangas wrote:
On 12/10/2023 12:48, Daniel Gustafsson wrote:
The attached patch adds special-case expression steps for common sets of steps
in the executor to shave a few cycles off during execution, and make the JIT
generated code simpler.* Adds EEOP_FUNCEXPR_STRICT_1 and EEOP_FUNCEXPR_STRICT_2 for function calls of
strict functions with 1 or 2 arguments (EEOP_FUNCEXPR_STRICT remains used for2 arguments).
* Adds EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1 which is a special case for the
common case of one arg aggs.Are these relevant when JITting? I'm a little sad if the JIT compiler cannot
unroll these on its own. Is there something we could do to hint it, so that
it could treat the number of arguments as a constant?
I think it's mainly important for interpreted execution.
skip extra setup for steps which are only interested in the side effects.
I'm a little surprised if this makes a measurable performance difference,
but sure, why not. It seems nice to be more explicit when you don't expect a
return value.
IIRC this is more interesting for JIT than the above, because it allows LLVM
to know that the return value isn't needed and thus doesn't need to be
computed.
Greetings,
Andres Freund
On 12 Oct 2023, at 19:52, Andres Freund <andres@anarazel.de> wrote:
On 2023-10-12 13:24:27 +0300, Heikki Linnakangas wrote:On 12/10/2023 12:48, Daniel Gustafsson wrote:
The attached patch adds special-case expression steps for common sets of steps
in the executor to shave a few cycles off during execution, and make the JIT
generated code simpler.* Adds EEOP_FUNCEXPR_STRICT_1 and EEOP_FUNCEXPR_STRICT_2 for function calls of
strict functions with 1 or 2 arguments (EEOP_FUNCEXPR_STRICT remains used for2 arguments).
* Adds EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1 which is a special case for the
common case of one arg aggs.Are these relevant when JITting? I'm a little sad if the JIT compiler cannot
unroll these on its own. Is there something we could do to hint it, so that
it could treat the number of arguments as a constant?I think it's mainly important for interpreted execution.
Agreed.
skip extra setup for steps which are only interested in the side effects.
I'm a little surprised if this makes a measurable performance difference,
but sure, why not. It seems nice to be more explicit when you don't expect a
return value.
Right, performance benefits aside it does improve readability IMHO.
IIRC this is more interesting for JIT than the above, because it allows LLVM
to know that the return value isn't needed and thus doesn't need to be
computed.
Correct, this is important to the JIT code which no longer has to perform two
Loads and one Store in order to get nothing, but can instead fastpath to
building a zero returnvalue.
--
Daniel Gustafsson
On 10/12/23 11:48 AM, Daniel Gustafsson wrote:
Thoughts?
I have looked at the patch and it still applies, builds and passes the
test cases and I personally think these optimizations are pretty much
no-brainers that we should do and it is a pity nobody has had the time
to review this patch.
1) The no-return case should help with the JIT, making jitted code faster.
2) The specialized strict steps helps with many common queries in the
interpreted mode.
The code itself looks really good (great work!) but I have two comments
on it.
1) I think the patch should be split into two. The two different
optimizations are not related at all other than that they create
specialized versions of expressions steps. Having them as separate makes
the commit history easier to read for future developers.
2) We could generate functions which return void rather than NULL and
therefore not have to do a return at all but I am not sure that small
optimization and extra clarity would be worth the hassle. The current
approach with adding Assert() is ok with me. Daniel, what do you think?
Andreas
On 6/20/24 5:22 PM, Andreas Karlsson wrote:
On 10/12/23 11:48 AM, Daniel Gustafsson wrote:
Thoughts?
I have looked at the patch and it still applies, builds and passes the
test cases and I personally think these optimizations are pretty much
no-brainers that we should do and it is a pity nobody has had the time
to review this patch.
Forgot to write that I am planning to also try to do so benchmarks to
see if I can reproduce the speedups. :)
Andreas
On 20 Jun 2024, at 17:22, Andreas Karlsson <andreas@proxel.se> wrote:
On 10/12/23 11:48 AM, Daniel Gustafsson wrote:
Thoughts?
I have looked at the patch and it still applies, builds and passes the test cases and I personally think these optimizations are pretty much no-brainers that we should do and it is a pity nobody has had the time to review this patch.
1) The no-return case should help with the JIT, making jitted code faster.
2) The specialized strict steps helps with many common queries in the interpreted mode.
The code itself looks really good (great work!) but I have two comments on it.
Thanks for review!
1) I think the patch should be split into two. The two different optimizations are not related at all other than that they create specialized versions of expressions steps. Having them as separate makes the commit history easier to read for future developers.
That's a good point, the attached v2 splits it into two separate commits.
2) We could generate functions which return void rather than NULL and therefore not have to do a return at all but I am not sure that small optimization and extra clarity would be worth the hassle. The current approach with adding Assert() is ok with me. Daniel, what do you think?
I'm not sure that would move the needle enough to warrant the extra complexity.
It could be worth pursuing, but it can be done separately from this.
--
Daniel Gustafsson
Attachments:
v2-0002-Add-special-case-fast-paths-for-strict-functions.patchapplication/octet-stream; name=v2-0002-Add-special-case-fast-paths-for-strict-functions.patch; x-unix-mode=0644Download
From 56247dcb2ac57eb60d0b4891797c9b8e0a16ca67 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Thu, 4 Jul 2024 17:58:23 +0200
Subject: [PATCH v2 2/2] Add special case fast-paths for strict functions
Many STRICT function calls will have one or two arguments, in which
case we can speed up checking for NULL input by avoiding setting up
a loop over the arguments. This adds EEOP_FUNCEXPR_STRICT_1 and the
corresponding EEOP_FUNCEXPR_STRICT_2 for functions with one and two
arguments respectively.
Author: Andres Freund, Daniel Gustafsson
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://postgr.es/m/415721CE-7D2E-4B74-B5D9-1950083BA03E@yesql.se
Discussion: https://postgr.es/m/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de
---
src/backend/executor/execExpr.c | 13 ++++-
src/backend/executor/execExprInterp.c | 74 ++++++++++++++++++++++++++-
src/backend/jit/llvm/llvmjit_expr.c | 7 ++-
src/include/executor/execExpr.h | 7 ++-
4 files changed, 96 insertions(+), 5 deletions(-)
diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index 9ca6a3fea3..0fbec71285 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -2701,7 +2701,15 @@ ExecInitFunc(ExprEvalStep *scratch, Expr *node, List *args, Oid funcid,
if (pgstat_track_functions <= flinfo->fn_stats)
{
if (flinfo->fn_strict && nargs > 0)
- scratch->opcode = EEOP_FUNCEXPR_STRICT;
+ {
+ /* Choose nargs optimized implementation if available. */
+ if (nargs == 1)
+ scratch->opcode = EEOP_FUNCEXPR_STRICT_1;
+ else if (nargs == 2)
+ scratch->opcode = EEOP_FUNCEXPR_STRICT_2;
+ else
+ scratch->opcode = EEOP_FUNCEXPR_STRICT;
+ }
else
scratch->opcode = EEOP_FUNCEXPR;
}
@@ -3713,6 +3721,8 @@ ExecBuildAggTrans(AggState *aggstate, AggStatePerPhase phase,
{
if (strictnulls)
scratch.opcode = EEOP_AGG_STRICT_INPUT_CHECK_NULLS;
+ else if (strictargs && pertrans->numTransInputs == 1)
+ scratch.opcode = EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1;
else
scratch.opcode = EEOP_AGG_STRICT_INPUT_CHECK_ARGS;
scratch.d.agg_strict_input_check.nulls = strictnulls;
@@ -3789,6 +3799,7 @@ ExecBuildAggTrans(AggState *aggstate, AggStatePerPhase phase,
as->d.jump.jumpdone = state->steps_len;
}
else if (as->opcode == EEOP_AGG_STRICT_INPUT_CHECK_ARGS ||
+ as->opcode == EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1 ||
as->opcode == EEOP_AGG_STRICT_INPUT_CHECK_NULLS)
{
Assert(as->d.agg_strict_input_check.jumpnull == -1);
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index 0b20e2885d..90f37db62e 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -316,7 +316,9 @@ ExecReadyInterpretedExpr(ExprState *state)
return;
}
else if (step0 == EEOP_CASE_TESTVAL &&
- step1 == EEOP_FUNCEXPR_STRICT &&
+ (step1 == EEOP_FUNCEXPR_STRICT ||
+ step1 == EEOP_FUNCEXPR_STRICT_1 ||
+ step1 == EEOP_FUNCEXPR_STRICT_2) &&
state->steps[0].d.casetest.value)
{
state->evalfunc_private = (void *) ExecJustApplyFuncToCase;
@@ -427,6 +429,8 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
&&CASE_EEOP_CONST,
&&CASE_EEOP_FUNCEXPR,
&&CASE_EEOP_FUNCEXPR_STRICT,
+ &&CASE_EEOP_FUNCEXPR_STRICT_1,
+ &&CASE_EEOP_FUNCEXPR_STRICT_2,
&&CASE_EEOP_FUNCEXPR_FUSAGE,
&&CASE_EEOP_FUNCEXPR_STRICT_FUSAGE,
&&CASE_EEOP_BOOL_AND_STEP_FIRST,
@@ -495,6 +499,7 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
&&CASE_EEOP_AGG_STRICT_DESERIALIZE,
&&CASE_EEOP_AGG_DESERIALIZE,
&&CASE_EEOP_AGG_STRICT_INPUT_CHECK_ARGS,
+ &&CASE_EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1,
&&CASE_EEOP_AGG_STRICT_INPUT_CHECK_NULLS,
&&CASE_EEOP_AGG_PLAIN_PERGROUP_NULLCHECK,
&&CASE_EEOP_AGG_PLAIN_TRANS_INIT_STRICT_BYVAL,
@@ -753,6 +758,7 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
EEO_NEXT();
}
+ /* strict function call with more than two arguments */
EEO_CASE(EEOP_FUNCEXPR_STRICT)
{
FunctionCallInfo fcinfo = op->d.func.fcinfo_data;
@@ -760,6 +766,8 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
int nargs = op->d.func.nargs;
Datum d;
+ Assert(nargs > 2);
+
/* strict function, so check for NULL args */
for (int argno = 0; argno < nargs; argno++)
{
@@ -778,6 +786,54 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
EEO_NEXT();
}
+ /* strict function call with one argument */
+ EEO_CASE(EEOP_FUNCEXPR_STRICT_1)
+ {
+ FunctionCallInfo fcinfo = op->d.func.fcinfo_data;
+ NullableDatum *args = fcinfo->args;
+
+ Assert(op->d.func.nargs == 1);
+
+ /* strict function, so check for NULL args */
+ if (args[0].isnull)
+ *op->resnull = true;
+ else
+ {
+ Datum d;
+
+ fcinfo->isnull = false;
+ d = op->d.func.fn_addr(fcinfo);
+ *op->resvalue = d;
+ *op->resnull = fcinfo->isnull;
+ }
+
+ EEO_NEXT();
+ }
+
+ /* strict function call with two arguments */
+ EEO_CASE(EEOP_FUNCEXPR_STRICT_2)
+ {
+ FunctionCallInfo fcinfo = op->d.func.fcinfo_data;
+ NullableDatum *args = fcinfo->args;
+
+ Assert(op->d.func.nargs == 2);
+
+ /* strict function, so check for NULL args */
+ if (args[0].isnull || args[1].isnull)
+ *op->resnull = true;
+ else
+ {
+ Datum d;
+
+ fcinfo->isnull = false;
+ d = op->d.func.fn_addr(fcinfo);
+ *op->resvalue = d;
+ *op->resnull = fcinfo->isnull;
+ }
+
+ EEO_NEXT();
+ }
+
EEO_CASE(EEOP_FUNCEXPR_FUSAGE)
{
/* not common enough to inline */
@@ -1679,11 +1735,14 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
* input is not NULL.
*/
+ /* when checking more than one argument */
EEO_CASE(EEOP_AGG_STRICT_INPUT_CHECK_ARGS)
{
NullableDatum *args = op->d.agg_strict_input_check.args;
int nargs = op->d.agg_strict_input_check.nargs;
+ Assert(nargs > 1);
+
for (int argno = 0; argno < nargs; argno++)
{
if (args[argno].isnull)
@@ -1692,6 +1751,19 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
EEO_NEXT();
}
+ /* special case for just one argument */
+ EEO_CASE(EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1)
+ {
+ NullableDatum *args = op->d.agg_strict_input_check.args;
+ PG_USED_FOR_ASSERTS_ONLY int nargs = op->d.agg_strict_input_check.nargs;
+
+ Assert(nargs == 1);
+
+ if (args[0].isnull)
+ EEO_JUMP(op->d.agg_strict_input_check.jumpnull);
+ EEO_NEXT();
+ }
+
EEO_CASE(EEOP_AGG_STRICT_INPUT_CHECK_NULLS)
{
bool *nulls = op->d.agg_strict_input_check.nulls;
diff --git a/src/backend/jit/llvm/llvmjit_expr.c b/src/backend/jit/llvm/llvmjit_expr.c
index 4cbbdc72b2..84b044d749 100644
--- a/src/backend/jit/llvm/llvmjit_expr.c
+++ b/src/backend/jit/llvm/llvmjit_expr.c
@@ -590,12 +590,16 @@ llvm_compile_expr(ExprState *state)
case EEOP_FUNCEXPR:
case EEOP_FUNCEXPR_STRICT:
+ case EEOP_FUNCEXPR_STRICT_1:
+ case EEOP_FUNCEXPR_STRICT_2:
{
FunctionCallInfo fcinfo = op->d.func.fcinfo_data;
LLVMValueRef v_fcinfo_isnull;
LLVMValueRef v_retval;
- if (opcode == EEOP_FUNCEXPR_STRICT)
+ if (opcode == EEOP_FUNCEXPR_STRICT ||
+ opcode == EEOP_FUNCEXPR_STRICT_1 ||
+ opcode == EEOP_FUNCEXPR_STRICT_2)
{
LLVMBasicBlockRef b_nonull;
LLVMBasicBlockRef *b_checkargnulls;
@@ -2164,6 +2168,7 @@ llvm_compile_expr(ExprState *state)
}
case EEOP_AGG_STRICT_INPUT_CHECK_ARGS:
+ case EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1:
case EEOP_AGG_STRICT_INPUT_CHECK_NULLS:
{
int nargs = op->d.agg_strict_input_check.nargs;
diff --git a/src/include/executor/execExpr.h b/src/include/executor/execExpr.h
index 4a7aca6a1a..d26c38dcdf 100644
--- a/src/include/executor/execExpr.h
+++ b/src/include/executor/execExpr.h
@@ -108,11 +108,13 @@ typedef enum ExprEvalOp
/*
* Evaluate function call (including OpExprs etc). For speed, we
- * distinguish in the opcode whether the function is strict and/or
- * requires usage stats tracking.
+ * distinguish in the opcode whether the function is strict with 1, 2, or
+ * more arguments and/or requires usage stats tracking.
*/
EEOP_FUNCEXPR,
EEOP_FUNCEXPR_STRICT,
+ EEOP_FUNCEXPR_STRICT_1,
+ EEOP_FUNCEXPR_STRICT_2,
EEOP_FUNCEXPR_FUSAGE,
EEOP_FUNCEXPR_STRICT_FUSAGE,
@@ -256,6 +258,7 @@ typedef enum ExprEvalOp
EEOP_AGG_STRICT_DESERIALIZE,
EEOP_AGG_DESERIALIZE,
EEOP_AGG_STRICT_INPUT_CHECK_ARGS,
+ EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1,
EEOP_AGG_STRICT_INPUT_CHECK_NULLS,
EEOP_AGG_PLAIN_PERGROUP_NULLCHECK,
EEOP_AGG_PLAIN_TRANS_INIT_STRICT_BYVAL,
--
2.39.3 (Apple Git-146)
v2-0001-Replace-EEOP_DONE-with-special-steps-for-return-n.patchapplication/octet-stream; name=v2-0001-Replace-EEOP_DONE-with-special-steps-for-return-n.patch; x-unix-mode=0644Download
From 959e0b4f16efb75c63d946540b6a261899483fa6 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Thu, 4 Jul 2024 16:55:52 +0200
Subject: [PATCH v2 1/2] Replace EEOP_DONE with special steps for return/no
return
Knowing when the side-effects of an expression is the intended result
of the execution, rather than the returnvalue, is important for being
able generate more efficient JITed code. This replaces EEOP_DONE with
two new steps: EEOP_DONE_RETURN and EEOP_DONE_NO_RETURN. Expressions
which return a value should use the former step; expressions used for
their side-effects which don't return value should use the latter.
Author: Andres Freund, Daniel Gustafsson
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://postgr.es/m/415721CE-7D2E-4B74-B5D9-1950083BA03E@yesql.se
Discussion: https://postgr.es/m/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de
---
src/backend/executor/README | 11 ++++--
src/backend/executor/execExpr.c | 20 +++++------
src/backend/executor/execExprInterp.c | 25 ++++++++-----
src/backend/executor/nodeAgg.c | 7 ++--
src/backend/jit/llvm/llvmjit_expr.c | 6 +++-
src/include/executor/execExpr.h | 7 ++--
src/include/executor/executor.h | 52 +++++++++++++++++++++++++--
7 files changed, 96 insertions(+), 32 deletions(-)
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 642d63be61..54f4782f31 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -133,9 +133,14 @@ is used by the function evaluation step, thus avoiding extra work to copy
the result values around.
The last entry in a completed ExprState->steps array is always an
-EEOP_DONE step; this removes the need to test for end-of-array while
-iterating. Also, if the expression contains any variable references (to
-user columns of the ExprContext's INNER, OUTER, or SCAN tuples), the steps
+EEOP_DONE_RETURN or EEOP_DONE_NO_RETURN step; this removes the need to
+test for end-of-array while iterating. The former is used when the
+expression returns a value directly, the latter when side-effects of
+expression initialization are the goal (e.g. for projection or
+aggregate transition value computation).
+
+Also, if the expression contains any variable references (to user
+columns of the ExprContext's INNER, OUTER, or SCAN tuples), the steps
array begins with EEOP_*_FETCHSOME steps that ensure that the relevant
tuples have been deconstructed to make the required columns directly
available (cf. slot_getsomeattrs()). This allows individual Var-fetching
diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index ccd4863778..9ca6a3fea3 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -8,7 +8,7 @@
* using ExecInitExpr() et al. This converts the tree into a flat array
* of ExprEvalSteps, which may be thought of as instructions in a program.
* At runtime, we'll execute steps, starting with the first, until we reach
- * an EEOP_DONE opcode.
+ * an EEOP_DONE_{RETURN|NO_RETURN} opcode.
*
* This file contains the "compilation" logic. It is independent of the
* specific execution technology we use (switch statement, computed goto,
@@ -153,7 +153,7 @@ ExecInitExpr(Expr *node, PlanState *parent)
ExecInitExprRec(node, state, &state->resvalue, &state->resnull);
/* Finally, append a DONE step */
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -190,7 +190,7 @@ ExecInitExprWithParams(Expr *node, ParamListInfo ext_params)
ExecInitExprRec(node, state, &state->resvalue, &state->resnull);
/* Finally, append a DONE step */
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -282,7 +282,7 @@ ExecInitQual(List *qual, PlanState *parent)
* have yielded TRUE, and since its result is stored in the desired output
* location, we're done.
*/
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -477,7 +477,7 @@ ExecBuildProjectionInfo(List *targetList,
}
}
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_NO_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -716,7 +716,7 @@ ExecBuildUpdateProjection(List *targetList,
}
}
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_NO_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -1665,7 +1665,7 @@ ExecInitExprRec(Expr *node, ExprState *state,
else
{
/* Not trivial, so append a DONE step */
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(elemstate, &scratch);
/* and ready the subexpression */
ExecReadyExpr(elemstate);
@@ -3812,7 +3812,7 @@ ExecBuildAggTrans(AggState *aggstate, AggStatePerPhase phase,
scratch.resvalue = NULL;
scratch.resnull = NULL;
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_NO_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -4076,7 +4076,7 @@ ExecBuildGroupingEqual(TupleDesc ldesc, TupleDesc rdesc,
scratch.resvalue = NULL;
scratch.resnull = NULL;
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -4210,7 +4210,7 @@ ExecBuildParamSetEqual(TupleDesc desc,
scratch.resvalue = NULL;
scratch.resnull = NULL;
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index d8735286c4..0b20e2885d 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -240,7 +240,8 @@ ExecReadyInterpretedExpr(ExprState *state)
/* Simple validity checks on expression */
Assert(state->steps_len >= 1);
- Assert(state->steps[state->steps_len - 1].opcode == EEOP_DONE);
+ Assert(state->steps[state->steps_len - 1].opcode == EEOP_DONE_RETURN ||
+ state->steps[state->steps_len - 1].opcode == EEOP_DONE_NO_RETURN);
/*
* Don't perform redundant initialization. This is unreachable in current
@@ -406,7 +407,8 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
*/
#if defined(EEO_USE_COMPUTED_GOTO)
static const void *const dispatch_table[] = {
- &&CASE_EEOP_DONE,
+ &&CASE_EEOP_DONE_RETURN,
+ &&CASE_EEOP_DONE_NO_RETURN,
&&CASE_EEOP_INNER_FETCHSOME,
&&CASE_EEOP_OUTER_FETCHSOME,
&&CASE_EEOP_SCAN_FETCHSOME,
@@ -530,9 +532,16 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
EEO_SWITCH()
{
- EEO_CASE(EEOP_DONE)
+ EEO_CASE(EEOP_DONE_RETURN)
{
- goto out;
+ *isnull = state->resnull;
+ return state->resvalue;
+ }
+
+ EEO_CASE(EEOP_DONE_NO_RETURN)
+ {
+ Assert(isnull == NULL);
+ return 0;
}
EEO_CASE(EEOP_INNER_FETCHSOME)
@@ -1885,13 +1894,13 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
{
/* unreachable */
Assert(false);
- goto out;
+ goto out_error;
}
}
-out:
- *isnull = state->resnull;
- return state->resvalue;
+out_error:
+ pg_unreachable();
+ return (Datum) 0;
}
/*
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index 53ead77ece..c8257a23d1 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -815,11 +815,8 @@ advance_transition_function(AggState *aggstate,
static void
advance_aggregates(AggState *aggstate)
{
- bool dummynull;
-
- ExecEvalExprSwitchContext(aggstate->phase->evaltrans,
- aggstate->tmpcontext,
- &dummynull);
+ ExecEvalExprNoReturnSwitchContext(aggstate->phase->evaltrans,
+ aggstate->tmpcontext);
}
/*
diff --git a/src/backend/jit/llvm/llvmjit_expr.c b/src/backend/jit/llvm/llvmjit_expr.c
index 306aea82d3..4cbbdc72b2 100644
--- a/src/backend/jit/llvm/llvmjit_expr.c
+++ b/src/backend/jit/llvm/llvmjit_expr.c
@@ -285,7 +285,7 @@ llvm_compile_expr(ExprState *state)
switch (opcode)
{
- case EEOP_DONE:
+ case EEOP_DONE_RETURN:
{
LLVMValueRef v_tmpisnull;
LLVMValueRef v_tmpvalue;
@@ -299,6 +299,10 @@ llvm_compile_expr(ExprState *state)
break;
}
+ case EEOP_DONE_NO_RETURN:
+ LLVMBuildRet(b, l_sizet_const(0));
+ break;
+
case EEOP_INNER_FETCHSOME:
case EEOP_OUTER_FETCHSOME:
case EEOP_SCAN_FETCHSOME:
diff --git a/src/include/executor/execExpr.h b/src/include/executor/execExpr.h
index 55337d4916..4a7aca6a1a 100644
--- a/src/include/executor/execExpr.h
+++ b/src/include/executor/execExpr.h
@@ -65,8 +65,11 @@ typedef struct ExprEvalRowtypeCache
*/
typedef enum ExprEvalOp
{
- /* entire expression has been evaluated completely, return */
- EEOP_DONE,
+ /* entire expression has been evaluated, return value */
+ EEOP_DONE_RETURN,
+
+ /* entire expression has been evaluated, no return value */
+ EEOP_DONE_NO_RETURN,
/* apply slot_getsomeattrs on corresponding tuple slot */
EEOP_INNER_FETCHSOME,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 9770752ea3..39a082e9ad 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -338,6 +338,34 @@ ExecEvalExpr(ExprState *state,
}
#endif
+/*
+ * ExecEvalExprNoReturn
+ *
+ * Like ExecEvalExpr(), but for cases where no return value is expected,
+ * because the side-effects of expression evaluation are what's desired. This
+ * is e.g. used for projection and aggregate transition computation.
+
+ * Evaluate expression identified by "state" in the execution context
+ * given by "econtext".
+ *
+ * The caller should already have switched into the temporary memory context
+ * econtext->ecxt_per_tuple_memory. The convenience entry point
+ * ExecEvalExprNoReturnSwitchContext() is provided for callers who don't
+ * prefer to do the switch in an outer loop.
+ */
+#ifndef FRONTEND
+static inline void
+ExecEvalExprNoReturn(ExprState *state,
+ ExprContext *econtext)
+{
+ PG_USED_FOR_ASSERTS_ONLY Datum retDatum;
+
+ retDatum = state->evalfunc(state, econtext, NULL);
+
+ Assert(retDatum == (Datum) 0);
+}
+#endif
+
/*
* ExecEvalExprSwitchContext
*
@@ -359,6 +387,25 @@ ExecEvalExprSwitchContext(ExprState *state,
}
#endif
+/*
+ * ExecEvalExprNoReturnSwitchContext
+ *
+ * Same as ExecEvalExprNoReturn, but get into the right allocation context
+ * explicitly.
+ */
+#ifndef FRONTEND
+static inline void
+ExecEvalExprNoReturnSwitchContext(ExprState *state,
+ ExprContext *econtext)
+{
+ MemoryContext oldContext;
+
+ oldContext = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
+ ExecEvalExprNoReturn(state, econtext);
+ MemoryContextSwitchTo(oldContext);
+}
+#endif
+
/*
* ExecProject
*
@@ -378,7 +425,6 @@ ExecProject(ProjectionInfo *projInfo)
ExprContext *econtext = projInfo->pi_exprContext;
ExprState *state = &projInfo->pi_state;
TupleTableSlot *slot = state->resultslot;
- bool isnull;
/*
* Clear any former contents of the result slot. This makes it safe for
@@ -386,8 +432,8 @@ ExecProject(ProjectionInfo *projInfo)
*/
ExecClearTuple(slot);
- /* Run the expression, discarding scalar result from the last column. */
- (void) ExecEvalExprSwitchContext(state, econtext, &isnull);
+ /* Run the expression */
+ ExecEvalExprNoReturnSwitchContext(state, econtext);
/*
* Successfully formed a result row. Mark the result slot as containing a
--
2.39.3 (Apple Git-146)
On 7/4/24 6:26 PM, Daniel Gustafsson wrote:
2) We could generate functions which return void rather than NULL and therefore not have to do a return at all but I am not sure that small optimization and extra clarity would be worth the hassle. The current approach with adding Assert() is ok with me. Daniel, what do you think?
I'm not sure that would move the needle enough to warrant the extra complexity.
It could be worth pursuing, but it can be done separately from this.
Agreed.
I looked some more at the patch and have a suggestion for code style.
Attaching the diff. Do with them what you wish, for me they make to code
easier to read.
Andreas
Attachments:
0001-Suggested-style-changes.patchtext/x-patch; charset=UTF-8; name=0001-Suggested-style-changes.patchDownload
From 9ea9a07ce6e4faf728ccc4a7b161a70a214601c8 Mon Sep 17 00:00:00 2001
From: Andreas Karlsson <andreas@proxel.se>
Date: Thu, 18 Jul 2024 21:54:29 +0200
Subject: [PATCH] Suggested style changes
---
src/backend/executor/execExprInterp.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index 90f37db62e..845d077b6f 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -546,7 +546,7 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
EEO_CASE(EEOP_DONE_NO_RETURN)
{
Assert(isnull == NULL);
- return 0;
+ return (Datum) 0;
}
EEO_CASE(EEOP_INNER_FETCHSOME)
@@ -1966,13 +1966,10 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
{
/* unreachable */
Assert(false);
- goto out_error;
+ pg_unreachable();
+ return (Datum) 0;
}
}
-
-out_error:
- pg_unreachable();
- return (Datum) 0;
}
/*
--
2.43.0
I have bench marked the two patches now and failed to measure any
speedup or slowdown from the first patch (removing return) but I think
it is a good idea anyway.
For the second patch (optimize strict) I managed to measure a ~1% speed
up for the following query "SELECT sum(x + y + 1) FROM t;" over one
million rows.
I would say both patches are ready for committer modulo my proposed
style fixes.
Andreas
On 22 Jul 2024, at 23:25, Andreas Karlsson <andreas@proxel.se> wrote:
I have bench marked the two patches now and failed to measure any speedup or slowdown from the first patch (removing return) but I think it is a good idea anyway.
For the second patch (optimize strict) I managed to measure a ~1% speed up for the following query "SELECT sum(x + y + 1) FROM t;" over one million rows.
That's expected, this is mostly about refactoring the code to simplifying the
JITed code (and making tiny strides towards JIT expression caching).
I would say both patches are ready for committer modulo my proposed style fixes.
I am a bit wary about removing the out_error label and goto since it may open
up for reports from static analyzers about control reaching the end of a
non-void function without a return. The other change has been incorporated.
The attached v3 is a rebase to handle executor changes done since v2, with the
above mentioned fix as well. If there are no objections I think we should
apply this version.
--
Daniel Gustafsson
Attachments:
v3-0001-Replace-EEOP_DONE-with-special-steps-for-return-n.patchapplication/octet-stream; name=v3-0001-Replace-EEOP_DONE-with-special-steps-for-return-n.patch; x-unix-mode=0644Download
From 4c0fc077155cc688770cc77724a09d7d0129cae2 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Thu, 4 Jul 2024 16:55:52 +0200
Subject: [PATCH v3 1/2] Replace EEOP_DONE with special steps for return/no
return
Knowing when the side-effects of an expression is the intended result
of the execution, rather than the returnvalue, is important for being
able generate more efficient JITed code. This replaces EEOP_DONE with
two new steps: EEOP_DONE_RETURN and EEOP_DONE_NO_RETURN. Expressions
which return a value should use the former step; expressions used for
their side-effects which don't return value should use the latter.
Author: Andres Freund, Daniel Gustafsson
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://postgr.es/m/415721CE-7D2E-4B74-B5D9-1950083BA03E@yesql.se
Discussion: https://postgr.es/m/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de
---
src/backend/executor/README | 11 ++++--
src/backend/executor/execExpr.c | 22 ++++++------
src/backend/executor/execExprInterp.c | 25 ++++++++-----
src/backend/executor/nodeAgg.c | 7 ++--
src/backend/jit/llvm/llvmjit_expr.c | 6 +++-
src/include/executor/execExpr.h | 7 ++--
src/include/executor/executor.h | 52 +++++++++++++++++++++++++--
7 files changed, 97 insertions(+), 33 deletions(-)
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 642d63be61..54f4782f31 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -133,9 +133,14 @@ is used by the function evaluation step, thus avoiding extra work to copy
the result values around.
The last entry in a completed ExprState->steps array is always an
-EEOP_DONE step; this removes the need to test for end-of-array while
-iterating. Also, if the expression contains any variable references (to
-user columns of the ExprContext's INNER, OUTER, or SCAN tuples), the steps
+EEOP_DONE_RETURN or EEOP_DONE_NO_RETURN step; this removes the need to
+test for end-of-array while iterating. The former is used when the
+expression returns a value directly, the latter when side-effects of
+expression initialization are the goal (e.g. for projection or
+aggregate transition value computation).
+
+Also, if the expression contains any variable references (to user
+columns of the ExprContext's INNER, OUTER, or SCAN tuples), the steps
array begins with EEOP_*_FETCHSOME steps that ensure that the relevant
tuples have been deconstructed to make the required columns directly
available (cf. slot_getsomeattrs()). This allows individual Var-fetching
diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index c8077aa57b..5351f6a81f 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -8,7 +8,7 @@
* using ExecInitExpr() et al. This converts the tree into a flat array
* of ExprEvalSteps, which may be thought of as instructions in a program.
* At runtime, we'll execute steps, starting with the first, until we reach
- * an EEOP_DONE opcode.
+ * an EEOP_DONE_{RETURN|NO_RETURN} opcode.
*
* This file contains the "compilation" logic. It is independent of the
* specific execution technology we use (switch statement, computed goto,
@@ -157,7 +157,7 @@ ExecInitExpr(Expr *node, PlanState *parent)
ExecInitExprRec(node, state, &state->resvalue, &state->resnull);
/* Finally, append a DONE step */
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -194,7 +194,7 @@ ExecInitExprWithParams(Expr *node, ParamListInfo ext_params)
ExecInitExprRec(node, state, &state->resvalue, &state->resnull);
/* Finally, append a DONE step */
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -286,7 +286,7 @@ ExecInitQual(List *qual, PlanState *parent)
* have yielded TRUE, and since its result is stored in the desired output
* location, we're done.
*/
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -481,7 +481,7 @@ ExecBuildProjectionInfo(List *targetList,
}
}
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_NO_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -720,7 +720,7 @@ ExecBuildUpdateProjection(List *targetList,
}
}
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_NO_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -1656,7 +1656,7 @@ ExecInitExprRec(Expr *node, ExprState *state,
else
{
/* Not trivial, so append a DONE step */
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(elemstate, &scratch);
/* and ready the subexpression */
ExecReadyExpr(elemstate);
@@ -3850,7 +3850,7 @@ ExecBuildAggTrans(AggState *aggstate, AggStatePerPhase phase,
scratch.resvalue = NULL;
scratch.resnull = NULL;
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_NO_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -4102,7 +4102,7 @@ ExecBuildHash32Expr(TupleDesc desc, const TupleTableSlotOps *ops,
scratch.resvalue = NULL;
scratch.resnull = NULL;
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -4255,7 +4255,7 @@ ExecBuildGroupingEqual(TupleDesc ldesc, TupleDesc rdesc,
scratch.resvalue = NULL;
scratch.resnull = NULL;
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -4389,7 +4389,7 @@ ExecBuildParamSetEqual(TupleDesc desc,
scratch.resvalue = NULL;
scratch.resnull = NULL;
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index 9fd988cc99..5cabf41341 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -240,7 +240,8 @@ ExecReadyInterpretedExpr(ExprState *state)
/* Simple validity checks on expression */
Assert(state->steps_len >= 1);
- Assert(state->steps[state->steps_len - 1].opcode == EEOP_DONE);
+ Assert(state->steps[state->steps_len - 1].opcode == EEOP_DONE_RETURN ||
+ state->steps[state->steps_len - 1].opcode == EEOP_DONE_NO_RETURN);
/*
* Don't perform redundant initialization. This is unreachable in current
@@ -406,7 +407,8 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
*/
#if defined(EEO_USE_COMPUTED_GOTO)
static const void *const dispatch_table[] = {
- &&CASE_EEOP_DONE,
+ &&CASE_EEOP_DONE_RETURN,
+ &&CASE_EEOP_DONE_NO_RETURN,
&&CASE_EEOP_INNER_FETCHSOME,
&&CASE_EEOP_OUTER_FETCHSOME,
&&CASE_EEOP_SCAN_FETCHSOME,
@@ -536,9 +538,16 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
EEO_SWITCH()
{
- EEO_CASE(EEOP_DONE)
+ EEO_CASE(EEOP_DONE_RETURN)
{
- goto out;
+ *isnull = state->resnull;
+ return state->resvalue;
+ }
+
+ EEO_CASE(EEOP_DONE_NO_RETURN)
+ {
+ Assert(isnull == NULL);
+ return (Datum) 0;
}
EEO_CASE(EEOP_INNER_FETCHSOME)
@@ -2003,13 +2012,13 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
{
/* unreachable */
Assert(false);
- goto out;
+ goto out_error;
}
}
-out:
- *isnull = state->resnull;
- return state->resvalue;
+out_error:
+ pg_unreachable();
+ return (Datum) 0;
}
/*
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index 53ead77ece..c8257a23d1 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -815,11 +815,8 @@ advance_transition_function(AggState *aggstate,
static void
advance_aggregates(AggState *aggstate)
{
- bool dummynull;
-
- ExecEvalExprSwitchContext(aggstate->phase->evaltrans,
- aggstate->tmpcontext,
- &dummynull);
+ ExecEvalExprNoReturnSwitchContext(aggstate->phase->evaltrans,
+ aggstate->tmpcontext);
}
/*
diff --git a/src/backend/jit/llvm/llvmjit_expr.c b/src/backend/jit/llvm/llvmjit_expr.c
index 48ccdb942a..28df6ef469 100644
--- a/src/backend/jit/llvm/llvmjit_expr.c
+++ b/src/backend/jit/llvm/llvmjit_expr.c
@@ -285,7 +285,7 @@ llvm_compile_expr(ExprState *state)
switch (opcode)
{
- case EEOP_DONE:
+ case EEOP_DONE_RETURN:
{
LLVMValueRef v_tmpisnull;
LLVMValueRef v_tmpvalue;
@@ -299,6 +299,10 @@ llvm_compile_expr(ExprState *state)
break;
}
+ case EEOP_DONE_NO_RETURN:
+ LLVMBuildRet(b, l_sizet_const(0));
+ break;
+
case EEOP_INNER_FETCHSOME:
case EEOP_OUTER_FETCHSOME:
case EEOP_SCAN_FETCHSOME:
diff --git a/src/include/executor/execExpr.h b/src/include/executor/execExpr.h
index eec0aa699e..2abd66b972 100644
--- a/src/include/executor/execExpr.h
+++ b/src/include/executor/execExpr.h
@@ -65,8 +65,11 @@ typedef struct ExprEvalRowtypeCache
*/
typedef enum ExprEvalOp
{
- /* entire expression has been evaluated completely, return */
- EEOP_DONE,
+ /* entire expression has been evaluated, return value */
+ EEOP_DONE_RETURN,
+
+ /* entire expression has been evaluated, no return value */
+ EEOP_DONE_NO_RETURN,
/* apply slot_getsomeattrs on corresponding tuple slot */
EEOP_INNER_FETCHSOME,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 69c3ebff00..507a32a6a2 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -349,6 +349,34 @@ ExecEvalExpr(ExprState *state,
}
#endif
+/*
+ * ExecEvalExprNoReturn
+ *
+ * Like ExecEvalExpr(), but for cases where no return value is expected,
+ * because the side-effects of expression evaluation are what's desired. This
+ * is e.g. used for projection and aggregate transition computation.
+
+ * Evaluate expression identified by "state" in the execution context
+ * given by "econtext".
+ *
+ * The caller should already have switched into the temporary memory context
+ * econtext->ecxt_per_tuple_memory. The convenience entry point
+ * ExecEvalExprNoReturnSwitchContext() is provided for callers who don't
+ * prefer to do the switch in an outer loop.
+ */
+#ifndef FRONTEND
+static inline void
+ExecEvalExprNoReturn(ExprState *state,
+ ExprContext *econtext)
+{
+ PG_USED_FOR_ASSERTS_ONLY Datum retDatum;
+
+ retDatum = state->evalfunc(state, econtext, NULL);
+
+ Assert(retDatum == (Datum) 0);
+}
+#endif
+
/*
* ExecEvalExprSwitchContext
*
@@ -370,6 +398,25 @@ ExecEvalExprSwitchContext(ExprState *state,
}
#endif
+/*
+ * ExecEvalExprNoReturnSwitchContext
+ *
+ * Same as ExecEvalExprNoReturn, but get into the right allocation context
+ * explicitly.
+ */
+#ifndef FRONTEND
+static inline void
+ExecEvalExprNoReturnSwitchContext(ExprState *state,
+ ExprContext *econtext)
+{
+ MemoryContext oldContext;
+
+ oldContext = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
+ ExecEvalExprNoReturn(state, econtext);
+ MemoryContextSwitchTo(oldContext);
+}
+#endif
+
/*
* ExecProject
*
@@ -389,7 +436,6 @@ ExecProject(ProjectionInfo *projInfo)
ExprContext *econtext = projInfo->pi_exprContext;
ExprState *state = &projInfo->pi_state;
TupleTableSlot *slot = state->resultslot;
- bool isnull;
/*
* Clear any former contents of the result slot. This makes it safe for
@@ -397,8 +443,8 @@ ExecProject(ProjectionInfo *projInfo)
*/
ExecClearTuple(slot);
- /* Run the expression, discarding scalar result from the last column. */
- (void) ExecEvalExprSwitchContext(state, econtext, &isnull);
+ /* Run the expression */
+ ExecEvalExprNoReturnSwitchContext(state, econtext);
/*
* Successfully formed a result row. Mark the result slot as containing a
--
2.39.3 (Apple Git-146)
v3-0002-Add-special-case-fast-paths-for-strict-functions.patchapplication/octet-stream; name=v3-0002-Add-special-case-fast-paths-for-strict-functions.patch; x-unix-mode=0644Download
From 64c250efc9522dfdfe1792f2e50b1369daa3387c Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Thu, 4 Jul 2024 17:58:23 +0200
Subject: [PATCH v3 2/2] Add special case fast-paths for strict functions
Many STRICT function calls will have one or two arguments, in which
case we can speed up checking for NULL input by avoiding setting up
a loop over the arguments. This adds EEOP_FUNCEXPR_STRICT_1 and the
corresponding EEOP_FUNCEXPR_STRICT_2 for functions with one and two
arguments respectively.
Author: Andres Freund, Daniel Gustafsson
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://postgr.es/m/415721CE-7D2E-4B74-B5D9-1950083BA03E@yesql.se
Discussion: https://postgr.es/m/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de
---
src/backend/executor/execExpr.c | 13 ++++-
src/backend/executor/execExprInterp.c | 74 ++++++++++++++++++++++++++-
src/backend/jit/llvm/llvmjit_expr.c | 7 ++-
src/include/executor/execExpr.h | 7 ++-
4 files changed, 96 insertions(+), 5 deletions(-)
diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index 5351f6a81f..39225056f0 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -2692,7 +2692,15 @@ ExecInitFunc(ExprEvalStep *scratch, Expr *node, List *args, Oid funcid,
if (pgstat_track_functions <= flinfo->fn_stats)
{
if (flinfo->fn_strict && nargs > 0)
- scratch->opcode = EEOP_FUNCEXPR_STRICT;
+ {
+ /* Choose nargs optimized implementation if available. */
+ if (nargs == 1)
+ scratch->opcode = EEOP_FUNCEXPR_STRICT_1;
+ else if (nargs == 2)
+ scratch->opcode = EEOP_FUNCEXPR_STRICT_2;
+ else
+ scratch->opcode = EEOP_FUNCEXPR_STRICT;
+ }
else
scratch->opcode = EEOP_FUNCEXPR;
}
@@ -3751,6 +3759,8 @@ ExecBuildAggTrans(AggState *aggstate, AggStatePerPhase phase,
{
if (strictnulls)
scratch.opcode = EEOP_AGG_STRICT_INPUT_CHECK_NULLS;
+ else if (strictargs && pertrans->numTransInputs == 1)
+ scratch.opcode = EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1;
else
scratch.opcode = EEOP_AGG_STRICT_INPUT_CHECK_ARGS;
scratch.d.agg_strict_input_check.nulls = strictnulls;
@@ -3827,6 +3837,7 @@ ExecBuildAggTrans(AggState *aggstate, AggStatePerPhase phase,
as->d.jump.jumpdone = state->steps_len;
}
else if (as->opcode == EEOP_AGG_STRICT_INPUT_CHECK_ARGS ||
+ as->opcode == EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1 ||
as->opcode == EEOP_AGG_STRICT_INPUT_CHECK_NULLS)
{
Assert(as->d.agg_strict_input_check.jumpnull == -1);
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index 5cabf41341..15e3aa1175 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -316,7 +316,9 @@ ExecReadyInterpretedExpr(ExprState *state)
return;
}
else if (step0 == EEOP_CASE_TESTVAL &&
- step1 == EEOP_FUNCEXPR_STRICT &&
+ (step1 == EEOP_FUNCEXPR_STRICT ||
+ step1 == EEOP_FUNCEXPR_STRICT_1 ||
+ step1 == EEOP_FUNCEXPR_STRICT_2) &&
state->steps[0].d.casetest.value)
{
state->evalfunc_private = (void *) ExecJustApplyFuncToCase;
@@ -427,6 +429,8 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
&&CASE_EEOP_CONST,
&&CASE_EEOP_FUNCEXPR,
&&CASE_EEOP_FUNCEXPR_STRICT,
+ &&CASE_EEOP_FUNCEXPR_STRICT_1,
+ &&CASE_EEOP_FUNCEXPR_STRICT_2,
&&CASE_EEOP_FUNCEXPR_FUSAGE,
&&CASE_EEOP_FUNCEXPR_STRICT_FUSAGE,
&&CASE_EEOP_BOOL_AND_STEP_FIRST,
@@ -501,6 +505,7 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
&&CASE_EEOP_AGG_STRICT_DESERIALIZE,
&&CASE_EEOP_AGG_DESERIALIZE,
&&CASE_EEOP_AGG_STRICT_INPUT_CHECK_ARGS,
+ &&CASE_EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1,
&&CASE_EEOP_AGG_STRICT_INPUT_CHECK_NULLS,
&&CASE_EEOP_AGG_PLAIN_PERGROUP_NULLCHECK,
&&CASE_EEOP_AGG_PLAIN_TRANS_INIT_STRICT_BYVAL,
@@ -759,6 +764,7 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
EEO_NEXT();
}
+ /* strict function call with more than two arguments */
EEO_CASE(EEOP_FUNCEXPR_STRICT)
{
FunctionCallInfo fcinfo = op->d.func.fcinfo_data;
@@ -766,6 +772,8 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
int nargs = op->d.func.nargs;
Datum d;
+ Assert(nargs > 2);
+
/* strict function, so check for NULL args */
for (int argno = 0; argno < nargs; argno++)
{
@@ -784,6 +792,54 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
EEO_NEXT();
}
+ /* strict function call with one argument */
+ EEO_CASE(EEOP_FUNCEXPR_STRICT_1)
+ {
+ FunctionCallInfo fcinfo = op->d.func.fcinfo_data;
+ NullableDatum *args = fcinfo->args;
+
+ Assert(op->d.func.nargs == 1);
+
+ /* strict function, so check for NULL args */
+ if (args[0].isnull)
+ *op->resnull = true;
+ else
+ {
+ Datum d;
+
+ fcinfo->isnull = false;
+ d = op->d.func.fn_addr(fcinfo);
+ *op->resvalue = d;
+ *op->resnull = fcinfo->isnull;
+ }
+
+ EEO_NEXT();
+ }
+
+ /* strict function call with two arguments */
+ EEO_CASE(EEOP_FUNCEXPR_STRICT_2)
+ {
+ FunctionCallInfo fcinfo = op->d.func.fcinfo_data;
+ NullableDatum *args = fcinfo->args;
+
+ Assert(op->d.func.nargs == 2);
+
+ /* strict function, so check for NULL args */
+ if (args[0].isnull || args[1].isnull)
+ *op->resnull = true;
+ else
+ {
+ Datum d;
+
+ fcinfo->isnull = false;
+ d = op->d.func.fn_addr(fcinfo);
+ *op->resvalue = d;
+ *op->resnull = fcinfo->isnull;
+ }
+
+ EEO_NEXT();
+ }
+
EEO_CASE(EEOP_FUNCEXPR_FUSAGE)
{
/* not common enough to inline */
@@ -1797,11 +1853,14 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
* input is not NULL.
*/
+ /* when checking more than one argument */
EEO_CASE(EEOP_AGG_STRICT_INPUT_CHECK_ARGS)
{
NullableDatum *args = op->d.agg_strict_input_check.args;
int nargs = op->d.agg_strict_input_check.nargs;
+ Assert(nargs > 1);
+
for (int argno = 0; argno < nargs; argno++)
{
if (args[argno].isnull)
@@ -1810,6 +1869,19 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
EEO_NEXT();
}
+ /* special case for just one argument */
+ EEO_CASE(EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1)
+ {
+ NullableDatum *args = op->d.agg_strict_input_check.args;
+ PG_USED_FOR_ASSERTS_ONLY int nargs = op->d.agg_strict_input_check.nargs;
+
+ Assert(nargs == 1);
+
+ if (args[0].isnull)
+ EEO_JUMP(op->d.agg_strict_input_check.jumpnull);
+ EEO_NEXT();
+ }
+
EEO_CASE(EEOP_AGG_STRICT_INPUT_CHECK_NULLS)
{
bool *nulls = op->d.agg_strict_input_check.nulls;
diff --git a/src/backend/jit/llvm/llvmjit_expr.c b/src/backend/jit/llvm/llvmjit_expr.c
index 28df6ef469..7c88ef4302 100644
--- a/src/backend/jit/llvm/llvmjit_expr.c
+++ b/src/backend/jit/llvm/llvmjit_expr.c
@@ -590,12 +590,16 @@ llvm_compile_expr(ExprState *state)
case EEOP_FUNCEXPR:
case EEOP_FUNCEXPR_STRICT:
+ case EEOP_FUNCEXPR_STRICT_1:
+ case EEOP_FUNCEXPR_STRICT_2:
{
FunctionCallInfo fcinfo = op->d.func.fcinfo_data;
LLVMValueRef v_fcinfo_isnull;
LLVMValueRef v_retval;
- if (opcode == EEOP_FUNCEXPR_STRICT)
+ if (opcode == EEOP_FUNCEXPR_STRICT ||
+ opcode == EEOP_FUNCEXPR_STRICT_1 ||
+ opcode == EEOP_FUNCEXPR_STRICT_2)
{
LLVMBasicBlockRef b_nonull;
LLVMBasicBlockRef *b_checkargnulls;
@@ -2374,6 +2378,7 @@ llvm_compile_expr(ExprState *state)
}
case EEOP_AGG_STRICT_INPUT_CHECK_ARGS:
+ case EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1:
case EEOP_AGG_STRICT_INPUT_CHECK_NULLS:
{
int nargs = op->d.agg_strict_input_check.nargs;
diff --git a/src/include/executor/execExpr.h b/src/include/executor/execExpr.h
index 2abd66b972..0d2a49f037 100644
--- a/src/include/executor/execExpr.h
+++ b/src/include/executor/execExpr.h
@@ -108,11 +108,13 @@ typedef enum ExprEvalOp
/*
* Evaluate function call (including OpExprs etc). For speed, we
- * distinguish in the opcode whether the function is strict and/or
- * requires usage stats tracking.
+ * distinguish in the opcode whether the function is strict with 1, 2, or
+ * more arguments and/or requires usage stats tracking.
*/
EEOP_FUNCEXPR,
EEOP_FUNCEXPR_STRICT,
+ EEOP_FUNCEXPR_STRICT_1,
+ EEOP_FUNCEXPR_STRICT_2,
EEOP_FUNCEXPR_FUSAGE,
EEOP_FUNCEXPR_STRICT_FUSAGE,
@@ -265,6 +267,7 @@ typedef enum ExprEvalOp
EEOP_AGG_STRICT_DESERIALIZE,
EEOP_AGG_DESERIALIZE,
EEOP_AGG_STRICT_INPUT_CHECK_ARGS,
+ EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1,
EEOP_AGG_STRICT_INPUT_CHECK_NULLS,
EEOP_AGG_PLAIN_PERGROUP_NULLCHECK,
EEOP_AGG_PLAIN_TRANS_INIT_STRICT_BYVAL,
--
2.39.3 (Apple Git-146)
On 9/10/24 10:54 AM, Daniel Gustafsson wrote:
On 22 Jul 2024, at 23:25, Andreas Karlsson <andreas@proxel.se> wrote:
I have bench marked the two patches now and failed to measure any speedup or slowdown from the first patch (removing return) but I think it is a good idea anyway.
For the second patch (optimize strict) I managed to measure a ~1% speed up for the following query "SELECT sum(x + y + 1) FROM t;" over one million rows.
That's expected, this is mostly about refactoring the code to simplifying the
JITed code (and making tiny strides towards JIT expression caching).
Yup! Expected and nice tiny speedup.
I would say both patches are ready for committer modulo my proposed style fixes.
I am a bit wary about removing the out_error label and goto since it may open
up for reports from static analyzers about control reaching the end of a
non-void function without a return. The other change has been incorporated.The attached v3 is a rebase to handle executor changes done since v2, with the
above mentioned fix as well. If there are no objections I think we should
apply this version.
Sounds good to me and in my opinion this is ready to be committed.
Andreas
On 13 Sep 2024, at 15:01, Andreas Karlsson <andreas@proxel.se> wrote:
On 9/10/24 10:54 AM, Daniel Gustafsson wrote:
On 22 Jul 2024, at 23:25, Andreas Karlsson <andreas@proxel.se> wrote:
I have bench marked the two patches now and failed to measure any speedup or slowdown from the first patch (removing return) but I think it is a good idea anyway.
For the second patch (optimize strict) I managed to measure a ~1% speed up for the following query "SELECT sum(x + y + 1) FROM t;" over one million rows.
That's expected, this is mostly about refactoring the code to simplifying the
JITed code (and making tiny strides towards JIT expression caching).Yup! Expected and nice tiny speedup.
I would say both patches are ready for committer modulo my proposed style fixes.
I am a bit wary about removing the out_error label and goto since it may open
up for reports from static analyzers about control reaching the end of a
non-void function without a return. The other change has been incorporated.
The attached v3 is a rebase to handle executor changes done since v2, with the
above mentioned fix as well. If there are no objections I think we should
apply this version.Sounds good to me and in my opinion this is ready to be committed.
This fell off the ever-growing TODO again. Re-reading it I still think it's a
good idea, it applied almost cleanly still and still gives a slight performance
improvement along with the more interesting refactoring which will make caching
of expressions easier down the line.
Attached is a rebased v4, unless there are objections I'll go ahead with this
version.
--
Daniel Gustafsson
Attachments:
v4-0002-Add-special-case-fast-paths-for-strict-functions.patchapplication/octet-stream; name=v4-0002-Add-special-case-fast-paths-for-strict-functions.patch; x-unix-mode=0644Download
From 91d594b90b0b84abfe30c0c090d4809ec2f5b4fa Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Wed, 5 Feb 2025 10:19:51 +0100
Subject: [PATCH v4 2/2] Add special case fast-paths for strict functions
Many STRICT function calls will have one or two arguments, in which
case we can speed up checking for NULL input by avoiding setting up
a loop over the arguments. This adds EEOP_FUNCEXPR_STRICT_1 and the
corresponding EEOP_FUNCEXPR_STRICT_2 for functions with one and two
arguments respectively.
Author: Andres Freund, Daniel Gustafsson
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://postgr.es/m/415721CE-7D2E-4B74-B5D9-1950083BA03E@yesql.se
Discussion: https://postgr.es/m/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de
---
src/backend/executor/execExpr.c | 15 +++++-
src/backend/executor/execExprInterp.c | 74 ++++++++++++++++++++++++++-
src/backend/jit/llvm/llvmjit_expr.c | 7 ++-
src/include/executor/execExpr.h | 7 ++-
4 files changed, 97 insertions(+), 6 deletions(-)
diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index 6e56d1468db..f1569879b52 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -2788,7 +2788,15 @@ ExecInitFunc(ExprEvalStep *scratch, Expr *node, List *args, Oid funcid,
if (pgstat_track_functions <= flinfo->fn_stats)
{
if (flinfo->fn_strict && nargs > 0)
- scratch->opcode = EEOP_FUNCEXPR_STRICT;
+ {
+ /* Choose nargs optimized implementation if available. */
+ if (nargs == 1)
+ scratch->opcode = EEOP_FUNCEXPR_STRICT_1;
+ else if (nargs == 2)
+ scratch->opcode = EEOP_FUNCEXPR_STRICT_2;
+ else
+ scratch->opcode = EEOP_FUNCEXPR_STRICT;
+ }
else
scratch->opcode = EEOP_FUNCEXPR;
}
@@ -3892,6 +3900,8 @@ ExecBuildAggTrans(AggState *aggstate, AggStatePerPhase phase,
{
if (strictnulls)
scratch.opcode = EEOP_AGG_STRICT_INPUT_CHECK_NULLS;
+ else if (strictargs && pertrans->numTransInputs == 1)
+ scratch.opcode = EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1;
else
scratch.opcode = EEOP_AGG_STRICT_INPUT_CHECK_ARGS;
scratch.d.agg_strict_input_check.nulls = strictnulls;
@@ -3968,6 +3978,7 @@ ExecBuildAggTrans(AggState *aggstate, AggStatePerPhase phase,
as->d.jump.jumpdone = state->steps_len;
}
else if (as->opcode == EEOP_AGG_STRICT_INPUT_CHECK_ARGS ||
+ as->opcode == EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1 ||
as->opcode == EEOP_AGG_STRICT_INPUT_CHECK_NULLS)
{
Assert(as->d.agg_strict_input_check.jumpnull == -1);
@@ -4722,7 +4733,7 @@ ExecBuildParamSetEqual(TupleDesc desc,
scratch.resvalue = NULL;
scratch.resnull = NULL;
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index 75d7ce469f8..8e9126a244b 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -366,7 +366,9 @@ ExecReadyInterpretedExpr(ExprState *state)
return;
}
else if (step0 == EEOP_CASE_TESTVAL &&
- step1 == EEOP_FUNCEXPR_STRICT)
+ (step1 == EEOP_FUNCEXPR_STRICT ||
+ step1 == EEOP_FUNCEXPR_STRICT_1 ||
+ step1 == EEOP_FUNCEXPR_STRICT_2))
{
state->evalfunc_private = ExecJustApplyFuncToCase;
return;
@@ -498,6 +500,8 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
&&CASE_EEOP_CONST,
&&CASE_EEOP_FUNCEXPR,
&&CASE_EEOP_FUNCEXPR_STRICT,
+ &&CASE_EEOP_FUNCEXPR_STRICT_1,
+ &&CASE_EEOP_FUNCEXPR_STRICT_2,
&&CASE_EEOP_FUNCEXPR_FUSAGE,
&&CASE_EEOP_FUNCEXPR_STRICT_FUSAGE,
&&CASE_EEOP_BOOL_AND_STEP_FIRST,
@@ -575,6 +579,7 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
&&CASE_EEOP_AGG_STRICT_DESERIALIZE,
&&CASE_EEOP_AGG_DESERIALIZE,
&&CASE_EEOP_AGG_STRICT_INPUT_CHECK_ARGS,
+ &&CASE_EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1,
&&CASE_EEOP_AGG_STRICT_INPUT_CHECK_NULLS,
&&CASE_EEOP_AGG_PLAIN_PERGROUP_NULLCHECK,
&&CASE_EEOP_AGG_PLAIN_TRANS_INIT_STRICT_BYVAL,
@@ -925,6 +930,7 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
EEO_NEXT();
}
+ /* strict function call with more than two arguments */
EEO_CASE(EEOP_FUNCEXPR_STRICT)
{
FunctionCallInfo fcinfo = op->d.func.fcinfo_data;
@@ -932,6 +938,8 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
int nargs = op->d.func.nargs;
Datum d;
+ Assert(nargs > 2);
+
/* strict function, so check for NULL args */
for (int argno = 0; argno < nargs; argno++)
{
@@ -950,6 +958,54 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
EEO_NEXT();
}
+ /* strict function call with one argument */
+ EEO_CASE(EEOP_FUNCEXPR_STRICT_1)
+ {
+ FunctionCallInfo fcinfo = op->d.func.fcinfo_data;
+ NullableDatum *args = fcinfo->args;
+
+ Assert(op->d.func.nargs == 1);
+
+ /* strict function, so check for NULL args */
+ if (args[0].isnull)
+ *op->resnull = true;
+ else
+ {
+ Datum d;
+
+ fcinfo->isnull = false;
+ d = op->d.func.fn_addr(fcinfo);
+ *op->resvalue = d;
+ *op->resnull = fcinfo->isnull;
+ }
+
+ EEO_NEXT();
+ }
+
+ /* strict function call with two arguments */
+ EEO_CASE(EEOP_FUNCEXPR_STRICT_2)
+ {
+ FunctionCallInfo fcinfo = op->d.func.fcinfo_data;
+ NullableDatum *args = fcinfo->args;
+
+ Assert(op->d.func.nargs == 2);
+
+ /* strict function, so check for NULL args */
+ if (args[0].isnull || args[1].isnull)
+ *op->resnull = true;
+ else
+ {
+ Datum d;
+
+ fcinfo->isnull = false;
+ d = op->d.func.fn_addr(fcinfo);
+ *op->resvalue = d;
+ *op->resnull = fcinfo->isnull;
+ }
+
+ EEO_NEXT();
+ }
+
EEO_CASE(EEOP_FUNCEXPR_FUSAGE)
{
/* not common enough to inline */
@@ -1982,11 +2038,14 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
* input is not NULL.
*/
+ /* when checking more than one argument */
EEO_CASE(EEOP_AGG_STRICT_INPUT_CHECK_ARGS)
{
NullableDatum *args = op->d.agg_strict_input_check.args;
int nargs = op->d.agg_strict_input_check.nargs;
+ Assert(nargs > 1);
+
for (int argno = 0; argno < nargs; argno++)
{
if (args[argno].isnull)
@@ -1995,6 +2054,19 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
EEO_NEXT();
}
+ /* special case for just one argument */
+ EEO_CASE(EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1)
+ {
+ NullableDatum *args = op->d.agg_strict_input_check.args;
+ PG_USED_FOR_ASSERTS_ONLY int nargs = op->d.agg_strict_input_check.nargs;
+
+ Assert(nargs == 1);
+
+ if (args[0].isnull)
+ EEO_JUMP(op->d.agg_strict_input_check.jumpnull);
+ EEO_NEXT();
+ }
+
EEO_CASE(EEOP_AGG_STRICT_INPUT_CHECK_NULLS)
{
bool *nulls = op->d.agg_strict_input_check.nulls;
diff --git a/src/backend/jit/llvm/llvmjit_expr.c b/src/backend/jit/llvm/llvmjit_expr.c
index 4080b01c17e..890bcb0b0a7 100644
--- a/src/backend/jit/llvm/llvmjit_expr.c
+++ b/src/backend/jit/llvm/llvmjit_expr.c
@@ -662,12 +662,16 @@ llvm_compile_expr(ExprState *state)
case EEOP_FUNCEXPR:
case EEOP_FUNCEXPR_STRICT:
+ case EEOP_FUNCEXPR_STRICT_1:
+ case EEOP_FUNCEXPR_STRICT_2:
{
FunctionCallInfo fcinfo = op->d.func.fcinfo_data;
LLVMValueRef v_fcinfo_isnull;
LLVMValueRef v_retval;
- if (opcode == EEOP_FUNCEXPR_STRICT)
+ if (opcode == EEOP_FUNCEXPR_STRICT ||
+ opcode == EEOP_FUNCEXPR_STRICT_1 ||
+ opcode == EEOP_FUNCEXPR_STRICT_2)
{
LLVMBasicBlockRef b_nonull;
LLVMBasicBlockRef *b_checkargnulls;
@@ -2482,6 +2486,7 @@ llvm_compile_expr(ExprState *state)
}
case EEOP_AGG_STRICT_INPUT_CHECK_ARGS:
+ case EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1:
case EEOP_AGG_STRICT_INPUT_CHECK_NULLS:
{
int nargs = op->d.agg_strict_input_check.nargs;
diff --git a/src/include/executor/execExpr.h b/src/include/executor/execExpr.h
index 9d549198e21..b6640e84a61 100644
--- a/src/include/executor/execExpr.h
+++ b/src/include/executor/execExpr.h
@@ -116,11 +116,13 @@ typedef enum ExprEvalOp
/*
* Evaluate function call (including OpExprs etc). For speed, we
- * distinguish in the opcode whether the function is strict and/or
- * requires usage stats tracking.
+ * distinguish in the opcode whether the function is strict with 1, 2, or
+ * more arguments and/or requires usage stats tracking.
*/
EEOP_FUNCEXPR,
EEOP_FUNCEXPR_STRICT,
+ EEOP_FUNCEXPR_STRICT_1,
+ EEOP_FUNCEXPR_STRICT_2,
EEOP_FUNCEXPR_FUSAGE,
EEOP_FUNCEXPR_STRICT_FUSAGE,
@@ -276,6 +278,7 @@ typedef enum ExprEvalOp
EEOP_AGG_STRICT_DESERIALIZE,
EEOP_AGG_DESERIALIZE,
EEOP_AGG_STRICT_INPUT_CHECK_ARGS,
+ EEOP_AGG_STRICT_INPUT_CHECK_ARGS_1,
EEOP_AGG_STRICT_INPUT_CHECK_NULLS,
EEOP_AGG_PLAIN_PERGROUP_NULLCHECK,
EEOP_AGG_PLAIN_TRANS_INIT_STRICT_BYVAL,
--
2.39.3 (Apple Git-146)
v4-0001-Replace-EEOP_DONE-with-special-steps-for-return-n.patchapplication/octet-stream; name=v4-0001-Replace-EEOP_DONE-with-special-steps-for-return-n.patch; x-unix-mode=0644Download
From 09c3018ee542d02702bb51eb8c11ac1fdea2a704 Mon Sep 17 00:00:00 2001
From: Daniel Gustafsson <dgustafsson@postgresql.org>
Date: Thu, 4 Jul 2024 16:55:52 +0200
Subject: [PATCH v4 1/2] Replace EEOP_DONE with special steps for return/no
return
Knowing when the side-effects of an expression is the intended result
of the execution, rather than the returnvalue, is important for being
able generate more efficient JITed code. This replaces EEOP_DONE with
two new steps: EEOP_DONE_RETURN and EEOP_DONE_NO_RETURN. Expressions
which return a value should use the former step; expressions used for
their side-effects which don't return value should use the latter.
Author: Andres Freund, Daniel Gustafsson
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Discussion: https://postgr.es/m/415721CE-7D2E-4B74-B5D9-1950083BA03E@yesql.se
Discussion: https://postgr.es/m/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de
---
src/backend/executor/README | 11 ++++--
src/backend/executor/execExpr.c | 22 ++++++------
src/backend/executor/execExprInterp.c | 25 ++++++++-----
src/backend/executor/nodeAgg.c | 7 ++--
src/backend/jit/llvm/llvmjit_expr.c | 6 +++-
src/include/executor/execExpr.h | 7 ++--
src/include/executor/executor.h | 52 +++++++++++++++++++++++++--
7 files changed, 97 insertions(+), 33 deletions(-)
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 642d63be613..54f4782f31b 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -133,9 +133,14 @@ is used by the function evaluation step, thus avoiding extra work to copy
the result values around.
The last entry in a completed ExprState->steps array is always an
-EEOP_DONE step; this removes the need to test for end-of-array while
-iterating. Also, if the expression contains any variable references (to
-user columns of the ExprContext's INNER, OUTER, or SCAN tuples), the steps
+EEOP_DONE_RETURN or EEOP_DONE_NO_RETURN step; this removes the need to
+test for end-of-array while iterating. The former is used when the
+expression returns a value directly, the latter when side-effects of
+expression initialization are the goal (e.g. for projection or
+aggregate transition value computation).
+
+Also, if the expression contains any variable references (to user
+columns of the ExprContext's INNER, OUTER, or SCAN tuples), the steps
array begins with EEOP_*_FETCHSOME steps that ensure that the relevant
tuples have been deconstructed to make the required columns directly
available (cf. slot_getsomeattrs()). This allows individual Var-fetching
diff --git a/src/backend/executor/execExpr.c b/src/backend/executor/execExpr.c
index 03566c4d181..6e56d1468db 100644
--- a/src/backend/executor/execExpr.c
+++ b/src/backend/executor/execExpr.c
@@ -8,7 +8,7 @@
* using ExecInitExpr() et al. This converts the tree into a flat array
* of ExprEvalSteps, which may be thought of as instructions in a program.
* At runtime, we'll execute steps, starting with the first, until we reach
- * an EEOP_DONE opcode.
+ * an EEOP_DONE_{RETURN|NO_RETURN} opcode.
*
* This file contains the "compilation" logic. It is independent of the
* specific execution technology we use (switch statement, computed goto,
@@ -162,7 +162,7 @@ ExecInitExpr(Expr *node, PlanState *parent)
ExecInitExprRec(node, state, &state->resvalue, &state->resnull);
/* Finally, append a DONE step */
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -199,7 +199,7 @@ ExecInitExprWithParams(Expr *node, ParamListInfo ext_params)
ExecInitExprRec(node, state, &state->resvalue, &state->resnull);
/* Finally, append a DONE step */
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -291,7 +291,7 @@ ExecInitQual(List *qual, PlanState *parent)
* have yielded TRUE, and since its result is stored in the desired output
* location, we're done.
*/
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -503,7 +503,7 @@ ExecBuildProjectionInfo(List *targetList,
}
}
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_NO_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -742,7 +742,7 @@ ExecBuildUpdateProjection(List *targetList,
}
}
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_NO_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -1714,7 +1714,7 @@ ExecInitExprRec(Expr *node, ExprState *state,
else
{
/* Not trivial, so append a DONE step */
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(elemstate, &scratch);
/* and ready the subexpression */
ExecReadyExpr(elemstate);
@@ -3991,7 +3991,7 @@ ExecBuildAggTrans(AggState *aggstate, AggStatePerPhase phase,
scratch.resvalue = NULL;
scratch.resnull = NULL;
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_NO_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -4258,7 +4258,7 @@ ExecBuildHash32FromAttrs(TupleDesc desc, const TupleTableSlotOps *ops,
scratch.resvalue = NULL;
scratch.resnull = NULL;
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -4431,7 +4431,7 @@ ExecBuildHash32Expr(TupleDesc desc, const TupleTableSlotOps *ops,
scratch.resvalue = NULL;
scratch.resnull = NULL;
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
@@ -4586,7 +4586,7 @@ ExecBuildGroupingEqual(TupleDesc ldesc, TupleDesc rdesc,
scratch.resvalue = NULL;
scratch.resnull = NULL;
- scratch.opcode = EEOP_DONE;
+ scratch.opcode = EEOP_DONE_RETURN;
ExprEvalPushStep(state, &scratch);
ExecReadyExpr(state);
diff --git a/src/backend/executor/execExprInterp.c b/src/backend/executor/execExprInterp.c
index 09f6a5f14c1..75d7ce469f8 100644
--- a/src/backend/executor/execExprInterp.c
+++ b/src/backend/executor/execExprInterp.c
@@ -246,7 +246,8 @@ ExecReadyInterpretedExpr(ExprState *state)
/* Simple validity checks on expression */
Assert(state->steps_len >= 1);
- Assert(state->steps[state->steps_len - 1].opcode == EEOP_DONE);
+ Assert(state->steps[state->steps_len - 1].opcode == EEOP_DONE_RETURN ||
+ state->steps[state->steps_len - 1].opcode == EEOP_DONE_NO_RETURN);
/*
* Don't perform redundant initialization. This is unreachable in current
@@ -469,7 +470,8 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
*/
#if defined(EEO_USE_COMPUTED_GOTO)
static const void *const dispatch_table[] = {
- &&CASE_EEOP_DONE,
+ &&CASE_EEOP_DONE_RETURN,
+ &&CASE_EEOP_DONE_NO_RETURN,
&&CASE_EEOP_INNER_FETCHSOME,
&&CASE_EEOP_OUTER_FETCHSOME,
&&CASE_EEOP_SCAN_FETCHSOME,
@@ -612,9 +614,16 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
EEO_SWITCH()
{
- EEO_CASE(EEOP_DONE)
+ EEO_CASE(EEOP_DONE_RETURN)
{
- goto out;
+ *isnull = state->resnull;
+ return state->resvalue;
+ }
+
+ EEO_CASE(EEOP_DONE_NO_RETURN)
+ {
+ Assert(isnull == NULL);
+ return (Datum) 0;
}
EEO_CASE(EEOP_INNER_FETCHSOME)
@@ -2188,13 +2197,13 @@ ExecInterpExpr(ExprState *state, ExprContext *econtext, bool *isnull)
{
/* unreachable */
Assert(false);
- goto out;
+ goto out_error;
}
}
-out:
- *isnull = state->resnull;
- return state->resvalue;
+out_error:
+ pg_unreachable();
+ return (Datum) 0;
}
/*
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index 3005b5c0e3b..73b6d18a15d 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -815,11 +815,8 @@ advance_transition_function(AggState *aggstate,
static void
advance_aggregates(AggState *aggstate)
{
- bool dummynull;
-
- ExecEvalExprSwitchContext(aggstate->phase->evaltrans,
- aggstate->tmpcontext,
- &dummynull);
+ ExecEvalExprNoReturnSwitchContext(aggstate->phase->evaltrans,
+ aggstate->tmpcontext);
}
/*
diff --git a/src/backend/jit/llvm/llvmjit_expr.c b/src/backend/jit/llvm/llvmjit_expr.c
index f0f5c3bd49f..4080b01c17e 100644
--- a/src/backend/jit/llvm/llvmjit_expr.c
+++ b/src/backend/jit/llvm/llvmjit_expr.c
@@ -321,7 +321,7 @@ llvm_compile_expr(ExprState *state)
switch (opcode)
{
- case EEOP_DONE:
+ case EEOP_DONE_RETURN:
{
LLVMValueRef v_tmpisnull;
LLVMValueRef v_tmpvalue;
@@ -335,6 +335,10 @@ llvm_compile_expr(ExprState *state)
break;
}
+ case EEOP_DONE_NO_RETURN:
+ LLVMBuildRet(b, l_sizet_const(0));
+ break;
+
case EEOP_INNER_FETCHSOME:
case EEOP_OUTER_FETCHSOME:
case EEOP_SCAN_FETCHSOME:
diff --git a/src/include/executor/execExpr.h b/src/include/executor/execExpr.h
index 51bd35dcb07..9d549198e21 100644
--- a/src/include/executor/execExpr.h
+++ b/src/include/executor/execExpr.h
@@ -65,8 +65,11 @@ typedef struct ExprEvalRowtypeCache
*/
typedef enum ExprEvalOp
{
- /* entire expression has been evaluated completely, return */
- EEOP_DONE,
+ /* entire expression has been evaluated, return value */
+ EEOP_DONE_RETURN,
+
+ /* entire expression has been evaluated, no return value */
+ EEOP_DONE_NO_RETURN,
/* apply slot_getsomeattrs on corresponding tuple slot */
EEOP_INNER_FETCHSOME,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 45b80e6b98e..e367d776468 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -351,6 +351,34 @@ ExecEvalExpr(ExprState *state,
}
#endif
+/*
+ * ExecEvalExprNoReturn
+ *
+ * Like ExecEvalExpr(), but for cases where no return value is expected,
+ * because the side-effects of expression evaluation are what's desired. This
+ * is e.g. used for projection and aggregate transition computation.
+
+ * Evaluate expression identified by "state" in the execution context
+ * given by "econtext".
+ *
+ * The caller should already have switched into the temporary memory context
+ * econtext->ecxt_per_tuple_memory. The convenience entry point
+ * ExecEvalExprNoReturnSwitchContext() is provided for callers who don't
+ * prefer to do the switch in an outer loop.
+ */
+#ifndef FRONTEND
+static inline void
+ExecEvalExprNoReturn(ExprState *state,
+ ExprContext *econtext)
+{
+ PG_USED_FOR_ASSERTS_ONLY Datum retDatum;
+
+ retDatum = state->evalfunc(state, econtext, NULL);
+
+ Assert(retDatum == (Datum) 0);
+}
+#endif
+
/*
* ExecEvalExprSwitchContext
*
@@ -372,6 +400,25 @@ ExecEvalExprSwitchContext(ExprState *state,
}
#endif
+/*
+ * ExecEvalExprNoReturnSwitchContext
+ *
+ * Same as ExecEvalExprNoReturn, but get into the right allocation context
+ * explicitly.
+ */
+#ifndef FRONTEND
+static inline void
+ExecEvalExprNoReturnSwitchContext(ExprState *state,
+ ExprContext *econtext)
+{
+ MemoryContext oldContext;
+
+ oldContext = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);
+ ExecEvalExprNoReturn(state, econtext);
+ MemoryContextSwitchTo(oldContext);
+}
+#endif
+
/*
* ExecProject
*
@@ -391,7 +438,6 @@ ExecProject(ProjectionInfo *projInfo)
ExprContext *econtext = projInfo->pi_exprContext;
ExprState *state = &projInfo->pi_state;
TupleTableSlot *slot = state->resultslot;
- bool isnull;
/*
* Clear any former contents of the result slot. This makes it safe for
@@ -399,8 +445,8 @@ ExecProject(ProjectionInfo *projInfo)
*/
ExecClearTuple(slot);
- /* Run the expression, discarding scalar result from the last column. */
- (void) ExecEvalExprSwitchContext(state, econtext, &isnull);
+ /* Run the expression */
+ ExecEvalExprNoReturnSwitchContext(state, econtext);
/*
* Successfully formed a result row. Mark the result slot as containing a
--
2.39.3 (Apple Git-146)
On 2/5/25 11:00 AM, Daniel Gustafsson wrote:
This fell off the ever-growing TODO again. Re-reading it I still think it's a
good idea, it applied almost cleanly still and still gives a slight performance
improvement along with the more interesting refactoring which will make caching
of expressions easier down the line.Attached is a rebased v4, unless there are objections I'll go ahead with this
version.
+1 Still seems like a nice change to me too.
Andreas
On 8 Mar 2025, at 17:15, Andreas Karlsson <andreas@proxel.se> wrote:
On 2/5/25 11:00 AM, Daniel Gustafsson wrote:
This fell off the ever-growing TODO again. Re-reading it I still think it's a
good idea, it applied almost cleanly still and still gives a slight performance
improvement along with the more interesting refactoring which will make caching
of expressions easier down the line.
Attached is a rebased v4, unless there are objections I'll go ahead with this
version.+1 Still seems like a nice change to me too.
Thanks, I actually had it staged to go in early next week.
--
Daniel Gustafsson
On 9 Mar 2025, at 23:35, Daniel Gustafsson <daniel@yesql.se> wrote:
On 8 Mar 2025, at 17:15, Andreas Karlsson <andreas@proxel.se> wrote:
+1 Still seems like a nice change to me too.
Thanks, I actually had it staged to go in early next week.
..which was done yesterday, thanks for review!
--
Daniel Gustafsson