PL/pgSQL and SPI

Started by Nonamealmost 27 years ago6 messages
#1Noname
jwieck@debis.com

Hi,

1. I've just committed some changes to PL/pgSQL and the SPI
manager.

It's a speedup of PL/pgSQL execution by calling
ExecEvalExpr() in the executor directly for simple
expressions that return one single Datum.

For the speed test I've removed all the setup stuff from
the plpgsql regression and ran the normal queries all in
one transaction. There are 196 query plans generated
during the regression and only 37 are left now for which
PL/pgSQL really calls SPI_execp().

This saves 30% of total execution time! I don't know how
much of the whole execution time is spent in PL/pgSQL and
how much is consumed by the normal query processing.

In another test I used a silly add function that simply
does a "return $1 + $2" and built a sum() aggregate on
top of it. In that case 65% of execution time to
summarize 20000 int4 values where saved. This is a
speedup by factor 3.

To be able to do so I've moved some of the declarations
from spi.c into a new header spi_priv.h so someone has
access to the _SPI_plan structure for past preparing
plan-/querytree analysis. And I've added two silly
functions SPI_push() and SPI_pop() that simply
increment/decrement the _SPI_curid value. This is
required for calling ExecEvalExpr(), because there could
be functions evaluated that use SPI themself and
otherwise they could not connect to the SPI manager. They
are dangerous and I'm in doubt if we should document
them.

2. While doing the above I've encountered some bad details
of the SPI manager and the executor. The Func and Oper
nodes point to a function cache, which is initially NULL
and is not copied by copyNode().

For every call of SPI_execp() to execute a prepared plan,
the whole plan is copied into the current memory context.
Since this clears out the fcache, the executor has to do
several syscache lookups for every function or operator
hit during execution of the plan.

Unfortunately I haven't found a way yet to avoid it.
Anything I tried so far ended in coredumps or other
misbehaviour. Maybe someone else has an idea.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#======================================== jwieck@debis.com (Jan Wieck) #

#2Vadim Mikheev
vadim@krs.ru
In reply to: Noname (#1)
Re: [HACKERS] PL/pgSQL and SPI

Jan Wieck wrote:

... And I've added two silly
functions SPI_push() and SPI_pop() that simply
increment/decrement the _SPI_curid value. This is
required for calling ExecEvalExpr(), because there could
be functions evaluated that use SPI themself and
otherwise they could not connect to the SPI manager. They
are dangerous and I'm in doubt if we should document

^^^^^^^^^
No more than improper call of SPI_finish()...

them.

2. While doing the above I've encountered some bad details
of the SPI manager and the executor. The Func and Oper
nodes point to a function cache, which is initially NULL
and is not copied by copyNode().

For every call of SPI_execp() to execute a prepared plan,
the whole plan is copied into the current memory context.
Since this clears out the fcache, the executor has to do
several syscache lookups for every function or operator
hit during execution of the plan.

Unfortunately I haven't found a way yet to avoid it.
Anything I tried so far ended in coredumps or other
misbehaviour. Maybe someone else has an idea.

Could we fill most of FunctionCache while parsing query ?!
We can do this for

int typlen; /* length of the return type */
int typbyval; /* true if return type is pass by value */
...
Oid foid; /* oid of the function in pg_proc */
Oid language; /* oid of the language in pg_language */
int nargs; /* number of arguments */

Oid *argOidVect; /* oids of all the arguments */
...
bool istrusted; /* trusted fn? */

and may be others too.

Vadim

#3Noname
jwieck@debis.com
In reply to: Vadim Mikheev (#2)
Re: [HACKERS] PL/pgSQL and SPI

Vadim wrote:

Jan Wieck wrote:

2. While doing the above I've encountered some bad details
of the SPI manager and the executor. The Func and Oper
nodes point to a function cache, which is initially NULL
and is not copied by copyNode().

For every call of SPI_execp() to execute a prepared plan,
the whole plan is copied into the current memory context.
Since this clears out the fcache, the executor has to do
several syscache lookups for every function or operator
hit during execution of the plan.

Unfortunately I haven't found a way yet to avoid it.
Anything I tried so far ended in coredumps or other
misbehaviour. Maybe someone else has an idea.

Could we fill most of FunctionCache while parsing query ?!
We can do this for

int typlen; /* length of the return type */
int typbyval; /* true if return type is pass by value */
...
Oid foid; /* oid of the function in pg_proc */
Oid language; /* oid of the language in pg_language */
int nargs; /* number of arguments */

Oid *argOidVect; /* oids of all the arguments */
...
bool istrusted; /* trusted fn? */

and may be others too.

And then letting copyNode() copy the fcache too so it's
allocated in the same memory context.

Will require a flag in the fcache that is used to tell that
setFcache() must be called to fill in the remaining fields
(there are some things taken from the actual executor state).
This flag is then cleared by copyNode() and the fields in
question left uncopied.

This might also let us get rid of the tree copy in
SPI_execp(), if we form another tree-traversal function that
resets the flag in all Func and Oper nodes of the whole tree,
so the prepared/saved plan can be used directly.

I'll give it a try some time.

Thanks for the kick, Vadim.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#======================================== jwieck@debis.com (Jan Wieck) #

#4Vadim Mikheev
vadim@krs.ru
In reply to: Noname (#3)
Re: [HACKERS] PL/pgSQL and SPI

Jan Wieck wrote:

Could we fill most of FunctionCache while parsing query ?!

And then letting copyNode() copy the fcache too so it's
allocated in the same memory context.

Or we could move these items from fcache struct to
Func/Oper node...

Will require a flag in the fcache that is used to tell that
setFcache() must be called to fill in the remaining fields
(there are some things taken from the actual executor state).
This flag is then cleared by copyNode() and the fields in
question left uncopied.

I missed here, please explain. What fields are you talking about?
Note that to support READ COMMITTED level I copy execution plan
_after_ execution started and so nothing used to keep execution
states, but not handled (re-initialized) by ExecInitNode,
must be copied.
Also, see below.

This might also let us get rid of the tree copy in
SPI_execp(), if we form another tree-traversal function that
resets the flag in all Func and Oper nodes of the whole tree,
so the prepared/saved plan can be used directly.

I'll give it a try some time.

Maybe. But note that if executor will try to use/pfree something
allocated in previous execution (in another memory context)
then we'll get trouble.

Vadim

#5Vadim Mikheev
vadim@krs.ru
In reply to: Noname (#1)
Re: [HACKERS] PL/pgSQL and SPI

Jan Wieck wrote:

1. I've just committed some changes to PL/pgSQL and the SPI
manager.

It's a speedup of PL/pgSQL execution by calling
ExecEvalExpr() in the executor directly for simple
expressions that return one single Datum.

...

To be able to do so I've moved some of the declarations
from spi.c into a new header spi_priv.h so someone has
access to the _SPI_plan structure for past preparing
plan-/querytree analysis. And I've added two silly
functions SPI_push() and SPI_pop() that simply
increment/decrement the _SPI_curid value. This is
required for calling ExecEvalExpr(), because there could
be functions evaluated that use SPI themself and
otherwise they could not connect to the SPI manager. They
are dangerous and I'm in doubt if we should document
them.

BTW, Jan, did you consider ability to add new function
for fast expression evaluation to SPI itself and than just
use this func in PL/pgSQL?
This function seems to be generally usefull.
And we could avoid SPI_push/SPI_pop...

Vadim

#6Noname
jwieck@debis.com
In reply to: Vadim Mikheev (#5)
Re: [HACKERS] PL/pgSQL and SPI

Vadim wrote:

Jan Wieck wrote:

1. I've just committed some changes to PL/pgSQL and the SPI
manager.

It's a speedup of PL/pgSQL execution by calling
ExecEvalExpr() in the executor directly for simple
expressions that return one single Datum.

BTW, Jan, did you consider ability to add new function
for fast expression evaluation to SPI itself and than just
use this func in PL/pgSQL?
This function seems to be generally usefull.
And we could avoid SPI_push/SPI_pop...

Clarification:

I'm doing many tests on the SPI generated plan to ensure
that it is so simple that ExecEvalExpr() cannot stumble
over it. In detail it must be something that has only one
targetentry, absolutely no qual, lefttree, righttree or
something else. And all the nodes in the TLE expression
must only be Expr (OP, FUNC, OR, AND, NOT only), Const or
Param ones.

This is required, because I have to fake an ExprContext
that contains the values for the parameters only. The
above ensures, that ExecEvalExpr() will never touch
anything else than the ecxt_param_list_info and thus will
not notice that it is a faked one.

Well, but you're right, I could add some smartness to SPI.
First, it could do the same checks on the generated plan that
ensure it really returns 1 (and only ever 1) Datum based only
on function calls, constants or parameters. If this is the
case, it could internally call ExecEvalExpr() and build a
faked heap tuple on SPI_execp(). Someone using SPI_exec()
isn't interested in speed, so I would leave it out there.

And two new functions

bool SPI_is_simple_expr(void *plan);
Datum SPI_eval_simple_expr(void *plan,
Datum *values,
char *Nulls,
bool *isNull,
Oid *rettype);

could gain more direct access to such expressions suppressing
the need to diddle with the SPI tuple table for getting just
one Datum.

Yes, I think it would be a good enhancement. I'll go for it.

Jan

--

#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#======================================== jwieck@debis.com (Jan Wieck) #