function calls optimization
Hi
Main goal of this patch is to avoid repeated calls of immutable/stable
functions.
This patch is against version 10.10.
I guess same logic could be implemented up till version 12.
Attachments:
postgresql-10.10-optfuncalls-v1.patchtext/x-patch; charset=US-ASCII; name=postgresql-10.10-optfuncalls-v1.patchDownload+167-9
Hi,
On 2019-11-03 21:56:31 +0100, Andrzej Barszcz wrote:
Main goal of this patch is to avoid repeated calls of immutable/stable
functions.
This patch is against version 10.10.
I guess same logic could be implemented up till version 12.
If you want actual development feedback, you're more likely to get that
when posting patches against the master branch.
--- src/include/nodes/execnodes.h 2019-08-05 23:16:54.000000000 +0200 +++ src/include/nodes/execnodes.h 2019-11-03 20:05:34.338305825 +0100 @@ -882,6 +883,39 @@ typedef struct PlanState TupleTableSlot *ps_ResultTupleSlot; /* slot for my result tuples */ ExprContext *ps_ExprContext; /* node's expression-evaluation context */ ProjectionInfo *ps_ProjInfo; /* info for doing tuple projection */ +#ifdef OPTFUNCALLS + /* was_called - list of ExprEvalStep* or FuncExpr* depending on execution stage + * + * Stage I. ExecInitExprRec() + * List gathers all not volatile, not set returning, not window FuncExpr*, + * equal nodes occupy one position in the list. Position in this list ( counting from 1 ) + * and planstate are remembered in actual ExprEvalStep* + * + * For query: select f(n),f(n) from t - was_called->length will be 1 and ptr_value + * will be FuncExpr* node of f(n) + * + * For query: select f(n),g(n),f(n) from t - list->length == 2 + * + * Stage II. ExecProcnode() + * For every planstate->was_called list changes its interpretation - from now on + * it is a list of ExprEvalStep* . Before executing real execProcnode + * every element of this list ( ptr_value ) is set to NULL. We don't know which + * function will be called first + * + * Stage III. ExecInterpExpr() case EEOP_FUNCEXPR + * ExprEvalStep.position > 0 means that in planstate->was_called could be ExprEvalStep* + * which was done yet or NULL. + * + * NULL means that eval step is entered first time and: + * 1. real function must be called + * 2. ExprEvalStep has to be remembered in planstate->was_called at position + * step->position - 1 + * + * NOT NULL means that in planstate->was_called is ExprEvalStep* with ready result, so + * there is no need to call function + */ + List *was_called; +#endif } PlanState;
I don't think the above describes a way to do this that is
acceptable. For one, I think this needs to happen at plan time, not for
every single execution of the statement. Nor do I think is it ok to make
ExecProcNode() etc slower for this feature - that's a very crucial
routine (similar with EEOP_FUNCEXPR checking the cache, but there we
could just have a different step type doing so).
Have you looked any of the previous work on such caching? I strongly
suggest doing so if you're interested in getting such a feature into
postgres. E.g. there's a fair bit of relevant discussion in
/messages/by-id/da87bb6a014e029176a04f6e50033cfb@postgrespro.ru
e.g. between Tom and me further down.
/* ---------------- --- src/include/executor/execExpr.h 2019-08-05 23:16:54.000000000 +0200 +++ src/include/executor/execExpr.h 2019-11-03 20:04:03.739025142 +0100 @@ -561,6 +561,10 @@ typedef struct ExprEvalStep AlternativeSubPlanState *asstate; } alternative_subplan; } d; +#ifdef OPTFUNCALLS + PlanState *planstate; /* parent PlanState for this expression */ + int position; /* position in planstate->was_called counted from 1 */ +#endif } ExprEvalStep;
This is not ok. We cannot just make every single ExprEvalStep larger for
this feature. Nor is clear why this is even needed - every ExprEvalStep
is associated with an ExprState, and ExprState already has a reference
to the parent planstate.
Greetings,
Andres Freund