function calls optimization

Started by Andrzej Barszczover 6 years ago2 messageshackers

abusinf@gmail.com

over 6 years ago

Main goal of this patch is to avoid repeated calls of immutable/stable
functions.
This patch is against version 10.10.
I guess same logic could be implemented up till version 12.

Andres Freund

andres@anarazel.de

over 6 years ago

In reply to: Andrzej Barszcz (#1)

Re: function calls optimization

Hi,

On 2019-11-03 21:56:31 +0100, Andrzej Barszcz wrote:

Main goal of this patch is to avoid repeated calls of immutable/stable
functions.
This patch is against version 10.10.
I guess same logic could be implemented up till version 12.

If you want actual development feedback, you're more likely to get that
when posting patches against the master branch.

--- src/include/nodes/execnodes.h	2019-08-05 23:16:54.000000000 +0200
+++ src/include/nodes/execnodes.h	2019-11-03 20:05:34.338305825 +0100
@@ -882,6 +883,39 @@ typedef struct PlanState
TupleTableSlot *ps_ResultTupleSlot; /* slot for my result tuples */
ExprContext *ps_ExprContext;	/* node's expression-evaluation context */
ProjectionInfo *ps_ProjInfo;	/* info for doing tuple projection */
+#ifdef OPTFUNCALLS
+	/* was_called - list of  ExprEvalStep* or FuncExpr* depending on execution stage
+	 * 
+	 * Stage I. ExecInitExprRec()
+	 *	List gathers all not volatile, not set returning, not window FuncExpr*,
+	 *	equal nodes occupy one position in the list. Position in this list ( counting from 1 )
+	 *	and planstate are remembered in actual ExprEvalStep*
+	 *
+	 * 	For query: select f(n),f(n) from t  - was_called->length will be 1 and ptr_value 
+	 *		 will be FuncExpr* node of f(n)
+	 *
+	 * 	For query: select f(n),g(n),f(n) from t - list->length == 2
+	 *
+	 * Stage II. ExecProcnode()
+	 *	For every planstate->was_called list changes its interpretation - from now on
+	 *	it is a list of ExprEvalStep* . Before executing real execProcnode
+	 *	every element of this list ( ptr_value ) is set to NULL. We don't know which
+	 *	function will be called first
+	 *
+	 * Stage III. ExecInterpExpr() case EEOP_FUNCEXPR
+	 *	ExprEvalStep.position > 0  means that in planstate->was_called could be ExprEvalStep*
+	 *	which was done yet or NULL.
+	 *
+	 *	NULL means that eval step is entered first time and:
+	 *		1. real function must be called
+	 *		2. ExprEvalStep has to be remembered in planstate->was_called at position
+	 *		step->position - 1
+	 *
+	 *	NOT NULL means that in planstate->was_called is ExprEvalStep* with ready result, so
+	 *	there is no need to call function
+	 */
+	List *was_called;
+#endif
} PlanState;

I don't think the above describes a way to do this that is
acceptable. For one, I think this needs to happen at plan time, not for
every single execution of the statement. Nor do I think is it ok to make
ExecProcNode() etc slower for this feature - that's a very crucial
routine (similar with EEOP_FUNCEXPR checking the cache, but there we
could just have a different step type doing so).

Have you looked any of the previous work on such caching? I strongly
suggest doing so if you're interested in getting such a feature into
postgres. E.g. there's a fair bit of relevant discussion in
/messages/by-id/da87bb6a014e029176a04f6e50033cfb@postgrespro.ru
e.g. between Tom and me further down.

/* ----------------
--- src/include/executor/execExpr.h	2019-08-05 23:16:54.000000000 +0200
+++ src/include/executor/execExpr.h	2019-11-03 20:04:03.739025142 +0100
@@ -561,6 +561,10 @@ typedef struct ExprEvalStep
AlternativeSubPlanState *asstate;
}			alternative_subplan;
}			d;
+#ifdef OPTFUNCALLS
+	PlanState *planstate;	/* parent PlanState for this expression */
+	int position;		/* position in planstate->was_called counted from 1 */
+#endif
} ExprEvalStep;

This is not ok. We cannot just make every single ExprEvalStep larger for
this feature. Nor is clear why this is even needed - every ExprEvalStep
is associated with an ExprState, and ExprState already has a reference
to the parent planstate.

Greetings,

Andres Freund

function calls optimization

Attachments: