Why overhead of SPI is so large?
Hi, hackers.
One of our customers complains about slow execution of PL/pgSQL
functions comparing with Oracle.
So he wants to compile PL/pgSQL functions (most likely just-in-time
compilation).
Certainly interpreter adds quite large overhead comparing with native
code (~10 times) but
most of PL/pgSQL functions are just running some SQL queues and
iterating through results.
I can not believe that JIT can significantly speed-up such functions.
So I decided to make simple experiment: I created large enough table
and implemented functions
which calculates norm of one column in different languages.
Results are frustrating (at least for me):
PL/pgSQL: 29044.361 ms
C/SPI: 22785.597 ms
С/coreAPI: 2873.072 ms
PL/Lua: 33445.520 ms
SQL: 7397.639 ms (with parallel execution disabled)
The fact that difference between PL/pgSQL and function implemented in C
using SPI is not so large was expected by me.
But why it is more than 3 time slower than correspondent SQL query?
The answer seems to be in the third result: the same function in C
implemented without SPI (usign table_beginscan/heap_getnext)
Looks like SPI adds quite significant overhead.
And as far as almost all PL languages are using SPI, them all suffer
from it.
Below is profile of SPI function execution:
9.47% postgres libc-2.23.so [.] __memcpy_avx_unaligned
9.19% postgres spitest.so [.] spi_norm
8.09% postgres postgres [.] AllocSetAlloc
4.50% postgres postgres [.] tts_buffer_heap_getsomeattrs
4.36% postgres postgres [.] heap_form_tuple
3.41% postgres postgres [.] standard_ExecutorRun
3.35% postgres postgres [.] ExecScan
3.31% postgres postgres [.] palloc0
2.41% postgres postgres [.] heapgettup_pagemode
2.40% postgres postgres [.] AllocSetReset
2.25% postgres postgres [.] PopActiveSnapshot
2.17% postgres postgres [.] PortalRunFetch
2.16% postgres postgres [.] HeapTupleSatisfiesVisibility
1.97% postgres libc-2.23.so [.] __sigsetjmp
1.90% postgres postgres [.] _SPI_cursor_operation
1.87% postgres postgres [.] AllocSetFree
1.86% postgres postgres [.] PortalRunSelect
1.79% postgres postgres [.] heap_getnextslot
1.75% postgres postgres [.] heap_fill_tuple
1.70% postgres postgres [.] spi_dest_startup
1.50% postgres postgres [.] spi_printtup
1.49% postgres postgres [.] nocachegetattr
1.45% postgres postgres [.] MemoryContextDelete
1.44% postgres postgres [.] ExecJustAssignScanVar
1.38% postgres postgres [.] CreateTupleDescCopy
1.33% postgres postgres [.] SPI_getbinval
1.30% postgres postgres [.] PushActiveSnapshot
1.30% postgres postgres [.] AllocSetContextCreateInternal
1.22% postgres postgres [.] heap_compute_data_size
1.22% postgres postgres [.] MemoryContextCreate
1.14% postgres postgres [.] heapgetpage
1.05% postgres postgres [.] palloc
1.03% postgres postgres [.] SeqNext
As you can see, most of the time is spent in allocation and copying memory.
I wonder if somebody tried to address this problem and are there some
plans for improving speed of PL/pgSQL and other
stored languages?
I attached to this mail sources of spi_test extension with my experiments.
Please build it and run norm.sql file.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
From: Konstantin Knizhnik [mailto:k.knizhnik@postgrespro.ru]
PL/pgSQL: 29044.361 ms
C/SPI: 22785.597 msThe fact that difference between PL/pgSQL and function implemented in C
using SPI is not so large was expected by me.
This PL/pgSQL overhead is not so significant compared with the three times, but makes me desire some feature like Oracle's ALTER PROCEDURE ... COMPILE; that compiles the PL/SQL logic to native code. I've seen a few dozen percent speed up.
Regards
Takayuki Tsunakawa
Hello.
At Wed, 21 Aug 2019 19:41:08 +0300, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote in <ed9da20e-01aa-d04b-d085-e6c16b14b9d7@postgrespro.ru>
Hi, hackers.
One of our customers complains about slow execution of PL/pgSQL
functions comparing with Oracle.
So he wants to compile PL/pgSQL functions (most likely just-in-time
compilation).
Certainly interpreter adds quite large overhead comparing with native
code (~10 times) but
most of PL/pgSQL functions are just running some SQL queues and
iterating through results.I can not believe that JIT can significantly speed-up such functions.
So I decided to make simple experiment: I created large enough table
and implemented functions
which calculates norm of one column in different languages.Results are frustrating (at least for me):
PL/pgSQL: 29044.361 ms
C/SPI: 22785.597 ms
С/coreAPI: 2873.072 ms
PL/Lua: 33445.520 ms
SQL: 7397.639 ms (with parallel execution disabled)The fact that difference between PL/pgSQL and function implemented in
C using SPI is not so large was expected by me.
But why it is more than 3 time slower than correspondent SQL query?
The answer seems to be in the third result: the same function in C
implemented without SPI (usign table_beginscan/heap_getnext)
Looks like SPI adds quite significant overhead.
And as far as almost all PL languages are using SPI, them all suffer
from it.
As far as looking the attached spitest.c, it seems that the
overhead comes from cursor operation, not from SPI. As far as
spitest.c goes, cursor is useless. "SQL" and C/coreAPI seem to
be scanning over the result from *a single* query. If that's
correct, why don't you use SPI_execute() then scan over
SPI_tuptable?
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On 22.08.2019 5:40, Kyotaro Horiguchi wrote:
Hello.
At Wed, 21 Aug 2019 19:41:08 +0300, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote in <ed9da20e-01aa-d04b-d085-e6c16b14b9d7@postgrespro.ru>
Hi, hackers.
One of our customers complains about slow execution of PL/pgSQL
functions comparing with Oracle.
So he wants to compile PL/pgSQL functions (most likely just-in-time
compilation).
Certainly interpreter adds quite large overhead comparing with native
code (~10 times) but
most of PL/pgSQL functions are just running some SQL queues and
iterating through results.I can not believe that JIT can significantly speed-up such functions.
So I decided to make simple experiment: I created large enough table
and implemented functions
which calculates norm of one column in different languages.Results are frustrating (at least for me):
PL/pgSQL: 29044.361 ms
C/SPI: 22785.597 ms
С/coreAPI: 2873.072 ms
PL/Lua: 33445.520 ms
SQL: 7397.639 ms (with parallel execution disabled)The fact that difference between PL/pgSQL and function implemented in
C using SPI is not so large was expected by me.
But why it is more than 3 time slower than correspondent SQL query?
The answer seems to be in the third result: the same function in C
implemented without SPI (usign table_beginscan/heap_getnext)
Looks like SPI adds quite significant overhead.
And as far as almost all PL languages are using SPI, them all suffer
from it.As far as looking the attached spitest.c, it seems that the
overhead comes from cursor operation, not from SPI. As far as
spitest.c goes, cursor is useless. "SQL" and C/coreAPI seem to
be scanning over the result from *a single* query. If that's
correct, why don't you use SPI_execute() then scan over
SPI_tuptable?
Scanned table is very large and doesn't fir in memory.
This is why I am using SPI cursors.
Please let me know if there is more efficient way to traverse larger
table using SPI.
regards.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 22.08.2019 3:27, Tsunakawa, Takayuki wrote:
From: Konstantin Knizhnik [mailto:k.knizhnik@postgrespro.ru]
PL/pgSQL: 29044.361 ms
C/SPI: 22785.597 msThe fact that difference between PL/pgSQL and function implemented in C
using SPI is not so large was expected by me.This PL/pgSQL overhead is not so significant compared with the three times, but makes me desire some feature like Oracle's ALTER PROCEDURE ... COMPILE; that compiles the PL/SQL logic to native code. I've seen a few dozen percent speed up.
Actually my implementation of C/SPI version is not optimal: it is better
to fetch several records:
while (true)
{
SPI_cursor_fetch(portal, true, 100);
if (SPI_processed) {
for (i = 0; i < SPI_processed; i++) {
HeapTuple spi_tuple = SPI_tuptable->vals[i];
Datum val = SPI_getbinval(spi_tuple,
SPI_tuptable->tupdesc, 1, &is_null);
double x = DatumGetFloat8(val);
result += x*x;
SPI_freetuple(spi_tuple);
}
SPI_freetuptable(SPI_tuptable);
} else
break;
}
This version shows result 9405.694 ms which is comparable with result of
SQL query.
Unfortunately (or fortunately) PL/pgSQL is already using prefetch. If it
is disables (when iterate through explicitly created cursor), time of
query execution is increased almost twice (46552.935 ms)
So PL/SPI ratio is more than three times.
Updatede results are the following:
Impementation
time (ms)
PL/Lua 32220
PL/pgSQL 29044
PL/pgSQL (JIT)
27594
C/SPI 9406
SQL 7399
SQL (JIT)
5532
С/coreAPI 2873
--
Show quoted text
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Some more information...
First of all I found out that marking PL/pgSQL function as immutable
significantly increase speed of its execution:
19808 ms vs. 27594. It happens because exec_eval_simple_expr is taken
snapshot if function is volatile (default).
I wonder if PL/pgSQL compiler can detect that evaluated expression
itself is actually immutable and there is no need to take snapshot
for each invocation of this function. Also I have tried yet another PL
language - JavaScript, which is now new outsider, despite to the fact that
v8 JIT compiler is very good.
Implementation
time (ms)
PL/v8
41550
PL/Lua 32220
PL/pgSQL 19808
C/SPI 9406
SQL 7399
SQL (JIT)
5532
С/coreAPI 2873
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
čt 22. 8. 2019 v 17:51 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:
Some more information...
First of all I found out that marking PL/pgSQL function as immutable
significantly increase speed of its execution:
19808 ms vs. 27594. It happens because exec_eval_simple_expr is taken
snapshot if function is volatile (default).
I wonder if PL/pgSQL compiler can detect that evaluated expression itself
is actually immutable and there is no need to take snapshot
for each invocation of this function. Also I have tried yet another PL
language - JavaScript, which is now new outsider, despite to the fact that
v8 JIT compiler is very good.
I have a plan to do some work in this direction. Snapshot is not necessary
for almost buildin functions. If expr calls only buildin functions, then
probably can be called without snapshot and without any work with plan
cache.
Pavel
Show quoted text
Implementation
time (ms)
PL/v8
41550
PL/Lua 32220
PL/pgSQL 19808
C/SPI 9406
SQL 7399
SQL (JIT)
5532
С/coreAPI 2873--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 22.08.2019 18:56, Pavel Stehule wrote:
čt 22. 8. 2019 v 17:51 odesílatel Konstantin Knizhnik
<k.knizhnik@postgrespro.ru <mailto:k.knizhnik@postgrespro.ru>> napsal:Some more information...
First of all I found out that marking PL/pgSQL function as
immutable significantly increase speed of its execution:
19808 ms vs. 27594. It happens because exec_eval_simple_expr is
taken snapshot if function is volatile (default).
I wonder if PL/pgSQL compiler can detect that evaluated expression
itself is actually immutable and there is no need to take snapshot
for each invocation of this function. Also I have tried yet
another PL language - JavaScript, which is now new outsider,
despite to the fact that
v8 JIT compiler is very good.I have a plan to do some work in this direction. Snapshot is not
necessary for almost buildin functions. If expr calls only buildin
functions, then probably can be called without snapshot and without
any work with plan cache.
I wonder if the following simple patch is correct?
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
plpgsql_exec_expr.patchtext/x-patch; name=plpgsql_exec_expr.patchDownload
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index a4697dc..98619be 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -6157,7 +6157,7 @@ exec_eval_simple_expr(PLpgSQL_execstate *estate,
* updates made so far by our own function.
*/
oldcontext = MemoryContextSwitchTo(get_eval_mcontext(estate));
- if (!estate->readonly_func)
+ if (!estate->readonly_func && expr->expr_needs_snapshot)
{
CommandCounterIncrement();
PushActiveSnapshot(GetTransactionSnapshot());
@@ -6182,7 +6182,7 @@ exec_eval_simple_expr(PLpgSQL_execstate *estate,
estate->paramLI->parserSetupArg = save_setup_arg;
- if (!estate->readonly_func)
+ if (!estate->readonly_func && expr->expr_needs_snapshot)
PopActiveSnapshot();
MemoryContextSwitchTo(oldcontext);
@@ -7978,6 +7978,31 @@ exec_simple_check_plan(PLpgSQL_execstate *estate, PLpgSQL_expr *expr)
}
/*
+ * expr_needs_snapshot --- check if expression contains calls of non-immutable functions or subqueries
+ */
+static bool
+expr_needs_snapshot(Node* expr, void* ctx)
+{
+ if (expr == NULL)
+ return false;
+ switch (nodeTag(expr))
+ {
+ case T_FuncExpr:
+ if (func_volatile(((FuncExpr *)expr)->funcid) != PROVOLATILE_IMMUTABLE)
+ return true;
+ break;
+ case T_Query:
+ case T_SubPlan:
+ case T_AlternativeSubPlan:
+ case T_SubLink:
+ return true;
+ default:
+ break;
+ }
+ return expression_tree_walker(expr, expr_needs_snapshot, ctx);
+}
+
+/*
* exec_save_simple_expr --- extract simple expression from CachedPlan
*/
static void
@@ -8046,6 +8071,7 @@ exec_save_simple_expr(PLpgSQL_expr *expr, CachedPlan *cplan)
* current transaction".
*/
expr->expr_simple_expr = tle_expr;
+ expr->expr_needs_snapshot = expr_needs_snapshot(tle_expr, NULL);
expr->expr_simple_generation = cplan->generation;
expr->expr_simple_state = NULL;
expr->expr_simple_in_use = false;
diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h
index f66b2ba..454131f 100644
--- a/src/pl/plpgsql/src/plpgsql.h
+++ b/src/pl/plpgsql/src/plpgsql.h
@@ -243,6 +243,7 @@ typedef struct PLpgSQL_expr
*/
ExprState *expr_simple_state; /* eval tree for expr_simple_expr */
bool expr_simple_in_use; /* true if eval tree is active */
+ bool expr_needs_snapshot; /* true if simple expression calls non-immutable functions or performs subqueries */
LocalTransactionId expr_simple_lxid;
} PLpgSQL_expr;
pá 23. 8. 2019 v 11:05 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:
On 22.08.2019 18:56, Pavel Stehule wrote:
čt 22. 8. 2019 v 17:51 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:Some more information...
First of all I found out that marking PL/pgSQL function as immutable
significantly increase speed of its execution:
19808 ms vs. 27594. It happens because exec_eval_simple_expr is taken
snapshot if function is volatile (default).
I wonder if PL/pgSQL compiler can detect that evaluated expression itself
is actually immutable and there is no need to take snapshot
for each invocation of this function. Also I have tried yet another PL
language - JavaScript, which is now new outsider, despite to the fact that
v8 JIT compiler is very good.I have a plan to do some work in this direction. Snapshot is not necessary
for almost buildin functions. If expr calls only buildin functions, then
probably can be called without snapshot and without any work with plan
cache.I wonder if the following simple patch is correct?
You cannot to believe to user defined functions so immutable flag is
correct. Only buildin functions are 100% correct.
CREATE OR REPLACE FUNCTION foo()
RETURNS int AS $$
SELECT count(*) FROM pg_class;
$$ LANGUAGE sql IMMUTABLE;
is working.
But your patch is good enough for benchmarking.
Pavel
Show quoted text
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 23.08.2019 12:10, Pavel Stehule wrote:
pá 23. 8. 2019 v 11:05 odesílatel Konstantin Knizhnik
<k.knizhnik@postgrespro.ru <mailto:k.knizhnik@postgrespro.ru>> napsal:On 22.08.2019 18:56, Pavel Stehule wrote:
čt 22. 8. 2019 v 17:51 odesílatel Konstantin Knizhnik
<k.knizhnik@postgrespro.ru <mailto:k.knizhnik@postgrespro.ru>>
napsal:Some more information...
First of all I found out that marking PL/pgSQL function as
immutable significantly increase speed of its execution:
19808 ms vs. 27594. It happens because exec_eval_simple_expr
is taken snapshot if function is volatile (default).
I wonder if PL/pgSQL compiler can detect that evaluated
expression itself is actually immutable and there is no need
to take snapshot
for each invocation of this function. Also I have tried yet
another PL language - JavaScript, which is now new outsider,
despite to the fact that
v8 JIT compiler is very good.I have a plan to do some work in this direction. Snapshot is not
necessary for almost buildin functions. If expr calls only
buildin functions, then probably can be called without snapshot
and without any work with plan cache.I wonder if the following simple patch is correct?
You cannot to believe to user defined functions so immutable flag is
correct. Only buildin functions are 100% correct.CREATE OR REPLACE FUNCTION foo()
RETURNS int AS $$
SELECT count(*) FROM pg_class;
$$ LANGUAGE sql IMMUTABLE;is working.
But such definition of the function contradicts IMMUTABLE contract,
doesn't it?
If creator of the function incorrectly classify it, then usage of such
function can cause incorrect behavior.
For example, if function is marked as "parallel safe" but actually it is
not parallel safe, then using it in parallel plan may cause incorrect
results.
But it is a reason for disabling parallel plans for all user defined
functions, isn't it?
Also nothing terrible will happen in any case. If expression is calling
function which is marked is immutable but actually is not, then we can
just get old (deteriorated)
result of expression. Right now, if caller function (one containing
evaluated expression) is marked as non-volatile, then snapshot is also
not taken.
So if such function expression is calling foo() function as declared
above, then results will be also incorrect.
So I do not think some principle difference here and do not understand
why we should not believe user (function creator) only in this case.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
pá 23. 8. 2019 v 13:21 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:
On 23.08.2019 12:10, Pavel Stehule wrote:
pá 23. 8. 2019 v 11:05 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:On 22.08.2019 18:56, Pavel Stehule wrote:
čt 22. 8. 2019 v 17:51 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:Some more information...
First of all I found out that marking PL/pgSQL function as immutable
significantly increase speed of its execution:
19808 ms vs. 27594. It happens because exec_eval_simple_expr is taken
snapshot if function is volatile (default).
I wonder if PL/pgSQL compiler can detect that evaluated expression
itself is actually immutable and there is no need to take snapshot
for each invocation of this function. Also I have tried yet another PL
language - JavaScript, which is now new outsider, despite to the fact that
v8 JIT compiler is very good.I have a plan to do some work in this direction. Snapshot is not
necessary for almost buildin functions. If expr calls only buildin
functions, then probably can be called without snapshot and without any
work with plan cache.I wonder if the following simple patch is correct?
You cannot to believe to user defined functions so immutable flag is
correct. Only buildin functions are 100% correct.CREATE OR REPLACE FUNCTION foo()
RETURNS int AS $$
SELECT count(*) FROM pg_class;
$$ LANGUAGE sql IMMUTABLE;is working.
But such definition of the function contradicts IMMUTABLE contract,
doesn't it?
If creator of the function incorrectly classify it, then usage of such
function can cause incorrect behavior.
For example, if function is marked as "parallel safe" but actually it is
not parallel safe, then using it in parallel plan may cause incorrect
results.
But it is a reason for disabling parallel plans for all user defined
functions, isn't it?
In reality it is not IMMUTABLE function. On second hand, there are lot of
application that depends on this behave.
It is well know trick how to reduce estimation errors related to JOINs.
When immutable function has constant parameters, then it is evaluated in
planning time.
So sometimes was necessary to use
SELECT ... FROM tab WHERE foreign_key = immutable_function('constant
parameter')
instead JOIN.
It is ugly, but it is working perfectly. I think so until we will have
multi table statistics, this behave should be available in Postgres.
Sure, this function should not be used for functional indexes.
Show quoted text
Also nothing terrible will happen in any case. If expression is calling
function which is marked is immutable but actually is not, then we can
just get old (deteriorated)
result of expression. Right now, if caller function (one containing
evaluated expression) is marked as non-volatile, then snapshot is also not
taken.
So if such function expression is calling foo() function as declared
above, then results will be also incorrect.
So I do not think some principle difference here and do not understand why
we should not believe user (function creator) only in this case.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 23.08.2019 14:42, Pavel Stehule wrote:
In reality it is not IMMUTABLE function. On second hand, there are lot
of application that depends on this behave.It is well know trick how to reduce estimation errors related to
JOINs. When immutable function has constant parameters, then it is
evaluated in planning time.So sometimes was necessary to use
SELECT ... FROM tab WHERE foreign_key = immutable_function('constant
parameter')instead JOIN.
It is ugly, but it is working perfectly. I think so until we will have
multi table statistics, this behave should be available in Postgres.Sure, this function should not be used for functional indexes.
What about the following version of the patch?
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
plpgsql_exec_expr-2.patchtext/x-patch; name=plpgsql_exec_expr-2.patchDownload
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index a4697dc..c530c4d 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -6157,7 +6157,7 @@ exec_eval_simple_expr(PLpgSQL_execstate *estate,
* updates made so far by our own function.
*/
oldcontext = MemoryContextSwitchTo(get_eval_mcontext(estate));
- if (!estate->readonly_func)
+ if (!estate->readonly_func && expr->expr_needs_snapshot)
{
CommandCounterIncrement();
PushActiveSnapshot(GetTransactionSnapshot());
@@ -6182,7 +6182,7 @@ exec_eval_simple_expr(PLpgSQL_execstate *estate,
estate->paramLI->parserSetupArg = save_setup_arg;
- if (!estate->readonly_func)
+ if (!estate->readonly_func && expr->expr_needs_snapshot)
PopActiveSnapshot();
MemoryContextSwitchTo(oldcontext);
@@ -7978,6 +7978,43 @@ exec_simple_check_plan(PLpgSQL_execstate *estate, PLpgSQL_expr *expr)
}
/*
+ * expr_needs_snapshot --- check if expression contains calls of non-immutable functions or subqueries
+ */
+static bool
+expr_needs_snapshot(Node* expr, void* ctx)
+{
+ if (expr == NULL)
+ return false;
+ switch (nodeTag(expr))
+ {
+ case T_FuncExpr:
+ {
+ FuncExpr *func = (FuncExpr *)expr;
+ if (func_volatile(func->funcid) != PROVOLATILE_IMMUTABLE
+ || !IsCatalogNamespace(get_func_namespace(func->funcid)))
+ return true;
+ break;
+ }
+ case T_OpExpr:
+ {
+ OpExpr *op = (OpExpr *)expr;
+ if (func_volatile(op->opfuncid) != PROVOLATILE_IMMUTABLE
+ || !IsCatalogNamespace(get_func_namespace(op->opfuncid)))
+ return true;
+ break;
+ }
+ case T_Query:
+ case T_SubPlan:
+ case T_AlternativeSubPlan:
+ case T_SubLink:
+ return true;
+ default:
+ break;
+ }
+ return expression_tree_walker(expr, expr_needs_snapshot, ctx);
+}
+
+/*
* exec_save_simple_expr --- extract simple expression from CachedPlan
*/
static void
@@ -8046,6 +8083,7 @@ exec_save_simple_expr(PLpgSQL_expr *expr, CachedPlan *cplan)
* current transaction".
*/
expr->expr_simple_expr = tle_expr;
+ expr->expr_needs_snapshot = expr_needs_snapshot(tle_expr, NULL);
expr->expr_simple_generation = cplan->generation;
expr->expr_simple_state = NULL;
expr->expr_simple_in_use = false;
diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h
index f66b2ba..454131f 100644
--- a/src/pl/plpgsql/src/plpgsql.h
+++ b/src/pl/plpgsql/src/plpgsql.h
@@ -243,6 +243,7 @@ typedef struct PLpgSQL_expr
*/
ExprState *expr_simple_state; /* eval tree for expr_simple_expr */
bool expr_simple_in_use; /* true if eval tree is active */
+ bool expr_needs_snapshot; /* true if simple expression calls non-immutable functions or performs subqueries */
LocalTransactionId expr_simple_lxid;
} PLpgSQL_expr;
On Fri, Aug 23, 2019 at 11:10:28AM +0200, Pavel Stehule wrote:
pá 23. 8. 2019 v 11:05 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:On 22.08.2019 18:56, Pavel Stehule wrote:
čt 22. 8. 2019 v 17:51 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:Some more information...
First of all I found out that marking PL/pgSQL function as immutable
significantly increase speed of its execution:
19808 ms vs. 27594. It happens because exec_eval_simple_expr is taken
snapshot if function is volatile (default).
I wonder if PL/pgSQL compiler can detect that evaluated expression itself
is actually immutable and there is no need to take snapshot
for each invocation of this function. Also I have tried yet another PL
language - JavaScript, which is now new outsider, despite to the fact that
v8 JIT compiler is very good.I have a plan to do some work in this direction. Snapshot is not necessary
for almost buildin functions. If expr calls only buildin functions, then
probably can be called without snapshot and without any work with plan
cache.I wonder if the following simple patch is correct?
You cannot to believe to user defined functions so immutable flag is
correct. Only buildin functions are 100% correct.CREATE OR REPLACE FUNCTION foo()
RETURNS int AS $$
SELECT count(*) FROM pg_class;
$$ LANGUAGE sql IMMUTABLE;is working.
No, it's lying to the RDBMS, so it's pilot error. The problem of
determining from the function itself whether it is in fact immutable
is, in general, equivalent to the Halting Problem, so no, we can't
figure it out. We do need to trust our users not to lie to us, and we
do not need to protect them from the consequences when they do.
Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
so 24. 8. 2019 v 18:01 odesílatel David Fetter <david@fetter.org> napsal:
On Fri, Aug 23, 2019 at 11:10:28AM +0200, Pavel Stehule wrote:
pá 23. 8. 2019 v 11:05 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:On 22.08.2019 18:56, Pavel Stehule wrote:
čt 22. 8. 2019 v 17:51 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:Some more information...
First of all I found out that marking PL/pgSQL function as immutable
significantly increase speed of its execution:
19808 ms vs. 27594. It happens because exec_eval_simple_expr is taken
snapshot if function is volatile (default).
I wonder if PL/pgSQL compiler can detect that evaluated expressionitself
is actually immutable and there is no need to take snapshot
for each invocation of this function. Also I have tried yet another PL
language - JavaScript, which is now new outsider, despite to the factthat
v8 JIT compiler is very good.
I have a plan to do some work in this direction. Snapshot is not
necessary
for almost buildin functions. If expr calls only buildin functions,
then
probably can be called without snapshot and without any work with plan
cache.I wonder if the following simple patch is correct?
You cannot to believe to user defined functions so immutable flag is
correct. Only buildin functions are 100% correct.CREATE OR REPLACE FUNCTION foo()
RETURNS int AS $$
SELECT count(*) FROM pg_class;
$$ LANGUAGE sql IMMUTABLE;is working.
No, it's lying to the RDBMS, so it's pilot error. The problem of
determining from the function itself whether it is in fact immutable
is, in general, equivalent to the Halting Problem, so no, we can't
figure it out. We do need to trust our users not to lie to us, and we
do not need to protect them from the consequences when they do.
I have not any problem with fixing this behave when there will be any
alternative.
I can imagine new special flag that can be used for STABLE functions, that
enforce one shot plans and can be optimized similar like IMMUTABLE
functions now - using result in planning time.
The users lie because they must - there is not a alternative. There is not
any other solution - and estimation errors related to a joins are
fundamental issue.
Regards
Pavel
Show quoted text
Best,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On Sat, Aug 24, 2019 at 12:01 PM David Fetter <david@fetter.org> wrote:
No, it's lying to the RDBMS, so it's pilot error. The problem of
determining from the function itself whether it is in fact immutable
is, in general, equivalent to the Halting Problem, so no, we can't
figure it out. We do need to trust our users not to lie to us, and we
do not need to protect them from the consequences when they do.
Depends. I don't mind if mislabeling a function leads to "wrong"
query results, but I don't think it's OK for it to, say, crash the
server.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 24.08.2019 19:13, Pavel Stehule wrote:
so 24. 8. 2019 v 18:01 odesílatel David Fetter <david@fetter.org
<mailto:david@fetter.org>> napsal:On Fri, Aug 23, 2019 at 11:10:28AM +0200, Pavel Stehule wrote:
pá 23. 8. 2019 v 11:05 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru <mailto:k.knizhnik@postgrespro.ru>>napsal:
On 22.08.2019 18:56, Pavel Stehule wrote:
čt 22. 8. 2019 v 17:51 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru <mailto:k.knizhnik@postgrespro.ru>>napsal:
Some more information...
First of all I found out that marking PL/pgSQL function asimmutable
significantly increase speed of its execution:
19808 ms vs. 27594. It happens because exec_eval_simple_expris taken
snapshot if function is volatile (default).
I wonder if PL/pgSQL compiler can detect that evaluatedexpression itself
is actually immutable and there is no need to take snapshot
for each invocation of this function. Also I have tried yetanother PL
language - JavaScript, which is now new outsider, despite to
the fact that
v8 JIT compiler is very good.
I have a plan to do some work in this direction. Snapshot is
not necessary
for almost buildin functions. If expr calls only buildin
functions, then
probably can be called without snapshot and without any work
with plan
cache.
I wonder if the following simple patch is correct?
You cannot to believe to user defined functions so immutable flag is
correct. Only buildin functions are 100% correct.CREATE OR REPLACE FUNCTION foo()
RETURNS int AS $$
SELECT count(*) FROM pg_class;
$$ LANGUAGE sql IMMUTABLE;is working.
No, it's lying to the RDBMS, so it's pilot error. The problem of
determining from the function itself whether it is in fact immutable
is, in general, equivalent to the Halting Problem, so no, we can't
figure it out. We do need to trust our users not to lie to us, and we
do not need to protect them from the consequences when they do.I have not any problem with fixing this behave when there will be any
alternative.I can imagine new special flag that can be used for STABLE functions,
that enforce one shot plans and can be optimized similar like
IMMUTABLE functions now - using result in planning time.The users lie because they must - there is not a alternative. There is
not any other solution - and estimation errors related to a joins are
fundamental issue.
Pavel, I wonder if I can put my patch (with fix which performs this
optimization only for built-in functions) to commitfest or you prefer to
do it yourself in some other way and propose your own solution?
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
pá 13. 9. 2019 v 9:09 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:
On 24.08.2019 19:13, Pavel Stehule wrote:
so 24. 8. 2019 v 18:01 odesílatel David Fetter <david@fetter.org> napsal:
On Fri, Aug 23, 2019 at 11:10:28AM +0200, Pavel Stehule wrote:
pá 23. 8. 2019 v 11:05 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:On 22.08.2019 18:56, Pavel Stehule wrote:
čt 22. 8. 2019 v 17:51 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:Some more information...
First of all I found out that marking PL/pgSQL function as immutable
significantly increase speed of its execution:
19808 ms vs. 27594. It happens because exec_eval_simple_expr is taken
snapshot if function is volatile (default).
I wonder if PL/pgSQL compiler can detect that evaluated expressionitself
is actually immutable and there is no need to take snapshot
for each invocation of this function. Also I have tried yet anotherPL
language - JavaScript, which is now new outsider, despite to the
fact that
v8 JIT compiler is very good.
I have a plan to do some work in this direction. Snapshot is not
necessary
for almost buildin functions. If expr calls only buildin functions,
then
probably can be called without snapshot and without any work with plan
cache.I wonder if the following simple patch is correct?
You cannot to believe to user defined functions so immutable flag is
correct. Only buildin functions are 100% correct.CREATE OR REPLACE FUNCTION foo()
RETURNS int AS $$
SELECT count(*) FROM pg_class;
$$ LANGUAGE sql IMMUTABLE;is working.
No, it's lying to the RDBMS, so it's pilot error. The problem of
determining from the function itself whether it is in fact immutable
is, in general, equivalent to the Halting Problem, so no, we can't
figure it out. We do need to trust our users not to lie to us, and we
do not need to protect them from the consequences when they do.I have not any problem with fixing this behave when there will be any
alternative.I can imagine new special flag that can be used for STABLE functions, that
enforce one shot plans and can be optimized similar like IMMUTABLE
functions now - using result in planning time.The users lie because they must - there is not a alternative. There is not
any other solution - and estimation errors related to a joins are
fundamental issue.Pavel, I wonder if I can put my patch (with fix which performs this
optimization only for built-in functions) to commitfest or you prefer to do
it yourself in some other way and propose your own solution?
I think so your patch is good enough for commitfest.
It doesn't remove all overhead - I think so there is lot of overhead
related to plan cache, but it in good direction.
Probably for these expressions is our final target using a cached JIT - but
nobody knows when it will be. I'll not have to time for my experiments
before October.
Show quoted text
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 13.09.2019 10:16, Pavel Stehule wrote:
pá 13. 9. 2019 v 9:09 odesílatel Konstantin Knizhnik
<k.knizhnik@postgrespro.ru <mailto:k.knizhnik@postgrespro.ru>> napsal:On 24.08.2019 19:13, Pavel Stehule wrote:
so 24. 8. 2019 v 18:01 odesílatel David Fetter <david@fetter.org
<mailto:david@fetter.org>> napsal:On Fri, Aug 23, 2019 at 11:10:28AM +0200, Pavel Stehule wrote:
pá 23. 8. 2019 v 11:05 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru<mailto:k.knizhnik@postgrespro.ru>> napsal:
On 22.08.2019 18:56, Pavel Stehule wrote:
čt 22. 8. 2019 v 17:51 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru<mailto:k.knizhnik@postgrespro.ru>> napsal:
Some more information...
First of all I found out that marking PL/pgSQL functionas immutable
significantly increase speed of its execution:
19808 ms vs. 27594. It happens becauseexec_eval_simple_expr is taken
snapshot if function is volatile (default).
I wonder if PL/pgSQL compiler can detect that evaluatedexpression itself
is actually immutable and there is no need to take snapshot
for each invocation of this function. Also I have triedyet another PL
language - JavaScript, which is now new outsider,
despite to the fact that
v8 JIT compiler is very good.
I have a plan to do some work in this direction. Snapshot
is not necessary
for almost buildin functions. If expr calls only buildin
functions, then
probably can be called without snapshot and without any
work with plan
cache.
I wonder if the following simple patch is correct?
You cannot to believe to user defined functions so
immutable flag is
correct. Only buildin functions are 100% correct.
CREATE OR REPLACE FUNCTION foo()
RETURNS int AS $$
SELECT count(*) FROM pg_class;
$$ LANGUAGE sql IMMUTABLE;is working.
No, it's lying to the RDBMS, so it's pilot error. The problem of
determining from the function itself whether it is in fact
immutable
is, in general, equivalent to the Halting Problem, so no, we
can't
figure it out. We do need to trust our users not to lie to
us, and we
do not need to protect them from the consequences when they do.I have not any problem with fixing this behave when there will be
any alternative.I can imagine new special flag that can be used for STABLE
functions, that enforce one shot plans and can be optimized
similar like IMMUTABLE functions now - using result in planning time.The users lie because they must - there is not a alternative.
There is not any other solution - and estimation errors related
to a joins are fundamental issue.Pavel, I wonder if I can put my patch (with fix which performs
this optimization only for built-in functions) to commitfest or
you prefer to do it yourself in some other way and propose your
own solution?I think so your patch is good enough for commitfest.
It doesn't remove all overhead - I think so there is lot of overhead
related to plan cache, but it in good direction.Probably for these expressions is our final target using a cached JIT
- but nobody knows when it will be. I'll not have to time for my
experiments before October.
This is profile of execution of PL/pgSQL function with my patch:
5.39% postgres plpgsql.so [.] exec_assign_value
5.10% postgres postgres [.] ExecInterpExpr
4.70% postgres postgres [.] tts_buffer_heap_getsomeattrs
4.56% postgres plpgsql.so [.] exec_move_row_from_fields
3.87% postgres postgres [.] ExecScan
3.74% postgres plpgsql.so [.] exec_eval_expr
3.64% postgres postgres [.] heap_form_tuple
3.13% postgres postgres [.] heap_fill_tuple
3.07% postgres postgres [.] heapgettup_pagemode
2.95% postgres postgres [.] heap_deform_tuple
2.92% postgres plpgsql.so [.] plpgsql_param_eval_var
2.64% postgres postgres [.] HeapTupleSatisfiesVisibility
2.61% postgres postgres [.] AcquirePlannerLocks
2.58% postgres postgres [.] AcquireExecutorLocks
2.43% postgres postgres [.] GetCachedPlan
2.26% postgres plpgsql.so [.] exec_stmt
2.23% postgres plpgsql.so [.] exec_cast_value
1.89% postgres postgres [.] AllocSetAlloc
1.75% postgres postgres [.] palloc0
1.73% postgres plpgsql.so [.] exec_move_row
1.73% postgres postgres [.] OverrideSearchPathMatchesCurrent
1.69% postgres plpgsql.so [.] assign_simple_var
1.63% postgres postgres [.] heap_getnextslot
1.60% postgres postgres [.] SPI_plan_get_cached_plan
1.55% postgres postgres [.] heapgetpage
1.47% postgres postgres [.] heap_compute_data_size
1.46% postgres postgres [.] spi_printtup
1.43% postgres postgres [.] float8mul
1.37% postgres postgres [.] RevalidateCachedQuery
1.36% postgres postgres [.] standard_ExecutorRun
1.35% postgres postgres [.] recomputeNamespacePath
1.28% postgres postgres [.] ExecStoreBufferHeapTuple
1.25% postgres postgres [.] MemoryContextReset
1.22% postgres plpgsql.so [.] exec_eval_cleanup.isra.18
1.20% postgres plpgsql.so [.] exec_assign_expr
1.05% postgres postgres [.] SeqNext
1.04% postgres postgres [.] ResourceArrayRemove
1.00% postgres postgres [.] ScanQueryForLocks
Based on this profile it seems to me that plan cache overhead is
relatively small:
2.43%+1.60%+1.37% < 6%
But from the other side ExecInterpExpr itself takes also about 5%.
I do not completely understand why JIT is not currently used for
evaluation of SPI expressions
(why we call ExecInterpExpr and do not try to compile this expression
even if JIT is enabled).
But event if we do it and improve speed of expression evaluation 10 or
more time, looks like
that effect on total query execution time will be also negligible (5%).
Most of the time is spent in pl_exec code, heap traversal , unpacking
and copying tuple data.
Looks like it can not be easily optimized and requires serious rewriting
of PL/pgSQL stuff.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Hi
I testing very simple function
create or replace function f1(int) returns int as $$ declare i int = 0;
begin while i < $1 loop i = i + 1; end loop; return i; end $$ language
plpgsql immutable;
profile - when function is marked as immutable
8,65% postgres [.] ExecInterpExpr
▒
8,59% postgres [.] AcquireExecutorLocks
▒
6,95% postgres [.] OverrideSearchPathMatchesCurrent
▒
5,72% plpgsql.so [.] plpgsql_param_eval_var
▒
5,15% postgres [.] AcquirePlannerLocks
▒
4,54% postgres [.] RevalidateCachedQuery
▒
4,52% postgres [.] GetCachedPlan
▒
3,82% postgres [.] ResourceArrayRemove
▒
2,87% postgres [.] SPI_plan_get_cached_plan
▒
2,80% plpgsql.so [.] exec_eval_expr
▒
2,70% plpgsql.so [.] exec_assign_value
▒
2,55% plpgsql.so [.] exec_stmt
▒
2,53% postgres [.] recomputeNamespacePath
▒
2,39% plpgsql.so [.] exec_cast_value
▒
2,19% postgres [.] int4pl
▒
2,13% postgres [.] int4lt
▒
1,98% postgres [.] CheckCachedPlan
volatile
7,21% postgres [.] GetSnapshotData
6,92% plpgsql.so [.] exec_eval_simple_expr
5,79% postgres [.] AcquireExecutorLocks
5,57% postgres [.] ExecInterpExpr
4,12% postgres [.] LWLockRelease
3,68% postgres [.] OverrideSearchPathMatchesCurrent
3,64% postgres [.] PopActiveSnapshot
3,36% plpgsql.so [.] plpgsql_param_eval_var
3,31% postgres [.] LWLockAttemptLock
3,13% postgres [.] AllocSetAlloc
2,91% postgres [.] GetCachedPlan
2,79% postgres [.] MemoryContextAlloc
2,76% postgres [.] AcquirePlannerLocks
2,70% postgres [.] ResourceArrayRemove
2,45% postgres [.] PushActiveSnapshot
2,44% postgres [.] RevalidateCachedQuery
2,29% postgres [.] SPI_plan_get_cached_plan
2,18% postgres [.] CopySnapshot
1,95% postgres [.] AllocSetFree
1,81% postgres [.] LWLockAcquire
1,71% plpgsql.so [.] exec_assign_value
1,61% plpgsql.so [.] exec_stmt
1,59% plpgsql.so [.] exec_eval_expr
1,48% postgres [.] int4pl
1,48% postgres [.] CheckCachedPlan
1,40% plpgsql.so [.] exec_cast_value
1,39% postgres [.] int4lt
1,38% postgres [.] recomputeNamespacePath
1,25% plpgsql.so [.] exec_eval_cleanup
1,08% postgres [.] ScanQueryForLocks
1,01% plpgsql.so [.] exec_eval_boolean
1,00% postgres [.] pfree
For tested function almost all CPU should be used for int4pl and int4lt
functions - but there are used only 4% together. I think so almost all of
8,59% postgres [.] AcquireExecutorLocks
▒
6,95% postgres [.] OverrideSearchPathMatchesCurrent
▒
5,72% plpgsql.so [.] plpgsql_param_eval_var
▒
5,15% postgres [.] AcquirePlannerLocks
▒
4,54% postgres [.] RevalidateCachedQuery
▒
4,52% postgres [.] GetCachedPlan
▒
3,82% postgres [.] ResourceArrayRemove
▒
2,87% postgres [.] SPI_plan_get_cached_plan
▒
2,53% postgres [.] recomputeNamespacePath
▒
can be reduced if we know so we should to call just builtin immutable V1
functions.
My example is a extrem - when you use any embedded SQL, then the profile
will be significantly changed. But for some cases there can be nice some
significant speedup of expressions only functions (like PostGIS)
Regards
Pavel
Hi
pá 23. 8. 2019 v 16:32 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:
On 23.08.2019 14:42, Pavel Stehule wrote:
In reality it is not IMMUTABLE function. On second hand, there are lot of
application that depends on this behave.It is well know trick how to reduce estimation errors related to JOINs.
When immutable function has constant parameters, then it is evaluated in
planning time.So sometimes was necessary to use
SELECT ... FROM tab WHERE foreign_key = immutable_function('constant
parameter')instead JOIN.
It is ugly, but it is working perfectly. I think so until we will have
multi table statistics, this behave should be available in Postgres.Sure, this function should not be used for functional indexes.
What about the following version of the patch?
I am sending review of this small patch.
This small patch reduce a overhead of usage buildin immutable functions in
volatile functions with simple trick. Starts snapshot only when it is
necessary.
In decrease runtime time about 25 % on this small example.
do $$
declare i int;
begin
i := 0;
while i < 10000000
loop
i := i + 1;
end loop;
end;
$$;
If there are more expressions, then speedup can be more interesting. If
there are other bottlenecks, then the speedup will be less. 25% is not bad,
so we want to this feature.
I believe so similar method can be used more aggressively with more
significant performance benefit, but this is low hanging fruit and isn't
reason to wait for future.
This patch doesn't introduce any new feature, so new tests and new doc is
not necessary.
The patch is readable, well formatted, only comments are too long. I fixed
it.
All tests passed
I fixed few warnings, and I reformated little bit function
expr_needs_snapshot to use if instead case, what is more usual in these
cases.
I think so this code can be marked as ready for commit
Regards
Pavel
Show quoted text
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
plpgsql_exec_expr-3.patchtext/x-patch; charset=US-ASCII; name=plpgsql_exec_expr-3.patchDownload
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index fd799b724d..fbe4708736 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -21,6 +21,7 @@
#include "access/htup_details.h"
#include "access/transam.h"
#include "access/tupconvert.h"
+#include "catalog/catalog.h"
#include "catalog/pg_proc.h"
#include "catalog/pg_type.h"
#include "commands/defrem.h"
@@ -6155,7 +6156,7 @@ exec_eval_simple_expr(PLpgSQL_execstate *estate,
* updates made so far by our own function.
*/
oldcontext = MemoryContextSwitchTo(get_eval_mcontext(estate));
- if (!estate->readonly_func)
+ if (!estate->readonly_func && expr->expr_needs_snapshot)
{
CommandCounterIncrement();
PushActiveSnapshot(GetTransactionSnapshot());
@@ -6180,7 +6181,7 @@ exec_eval_simple_expr(PLpgSQL_execstate *estate,
estate->paramLI->parserSetupArg = save_setup_arg;
- if (!estate->readonly_func)
+ if (!estate->readonly_func && expr->expr_needs_snapshot)
PopActiveSnapshot();
MemoryContextSwitchTo(oldcontext);
@@ -7975,6 +7976,39 @@ exec_simple_check_plan(PLpgSQL_execstate *estate, PLpgSQL_expr *expr)
ReleaseCachedPlan(cplan, true);
}
+/*
+ * expr_needs_snapshot --- check if expression contains calls of non-immutable
+ * functions or subqueries
+ */
+static bool
+expr_needs_snapshot(Node *expr, void *ctx)
+{
+ if (expr == NULL)
+ return false;
+
+ if (IsA(expr, FuncExpr))
+ {
+ FuncExpr *func = (FuncExpr *) expr;
+
+ if ((func_volatile(func->funcid) != PROVOLATILE_IMMUTABLE) ||
+ !IsCatalogNamespace(get_func_namespace(func->funcid)))
+ return true;
+ }
+ else if (IsA(expr, OpExpr))
+ {
+ OpExpr *op = (OpExpr *) expr;
+
+ if ((func_volatile(op->opfuncid) != PROVOLATILE_IMMUTABLE) ||
+ !IsCatalogNamespace(get_func_namespace(op->opfuncid)))
+ return true;
+ }
+ else if (IsA(expr, Query) || IsA(expr, SubPlan) ||
+ IsA(expr, AlternativeSubPlan) || IsA(expr, SubLink))
+ return true;
+
+ return expression_tree_walker(expr, expr_needs_snapshot, ctx);
+}
+
/*
* exec_save_simple_expr --- extract simple expression from CachedPlan
*/
@@ -8044,6 +8078,7 @@ exec_save_simple_expr(PLpgSQL_expr *expr, CachedPlan *cplan)
* current transaction".
*/
expr->expr_simple_expr = tle_expr;
+ expr->expr_needs_snapshot = expr_needs_snapshot((Node *) tle_expr, NULL);
expr->expr_simple_generation = cplan->generation;
expr->expr_simple_state = NULL;
expr->expr_simple_in_use = false;
diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h
index f66b2baa70..b745f5b3a5 100644
--- a/src/pl/plpgsql/src/plpgsql.h
+++ b/src/pl/plpgsql/src/plpgsql.h
@@ -243,6 +243,13 @@ typedef struct PLpgSQL_expr
*/
ExprState *expr_simple_state; /* eval tree for expr_simple_expr */
bool expr_simple_in_use; /* true if eval tree is active */
+
+ /*
+ * when expression use only buildin immutable functions, then it
+ * doesn't need special snapshot. For other cases this value is true
+ * for volatile functions and false for any other.
+ */
+ bool expr_needs_snapshot;
LocalTransactionId expr_simple_lxid;
} PLpgSQL_expr;
Hello.
At Tue, 5 Nov 2019 22:14:40 +0100, Pavel Stehule <pavel.stehule@gmail.com> wrote in
Hi
pá 23. 8. 2019 v 16:32 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:On 23.08.2019 14:42, Pavel Stehule wrote:
In reality it is not IMMUTABLE function. On second hand, there are lot of
application that depends on this behave.It is well know trick how to reduce estimation errors related to JOINs.
When immutable function has constant parameters, then it is evaluated in
planning time.So sometimes was necessary to use
SELECT ... FROM tab WHERE foreign_key = immutable_function('constant
parameter')instead JOIN.
It is ugly, but it is working perfectly. I think so until we will have
multi table statistics, this behave should be available in Postgres.Sure, this function should not be used for functional indexes.
What about the following version of the patch?
I am sending review of this small patch.
This small patch reduce a overhead of usage buildin immutable functions in
volatile functions with simple trick. Starts snapshot only when it is
necessary.In decrease runtime time about 25 % on this small example.
do $$
declare i int;
begin
i := 0;
while i < 10000000
loop
i := i + 1;
end loop;
end;
$$;If there are more expressions, then speedup can be more interesting. If
there are other bottlenecks, then the speedup will be less. 25% is not bad,
so we want to this feature.I believe so similar method can be used more aggressively with more
significant performance benefit, but this is low hanging fruit and isn't
reason to wait for future.This patch doesn't introduce any new feature, so new tests and new doc is
not necessary.
The patch is readable, well formatted, only comments are too long. I fixed
it.
All tests passed
I fixed few warnings, and I reformated little bit function
expr_needs_snapshot to use if instead case, what is more usual in these
cases.I think so this code can be marked as ready for commit
I have some comments on this.
expr_needs_snapshot checks out some of the node already checked out in
exec_simple_check_plan but not all. However I don't see the criteria
of the choice.
I might be too worrying, but maybe we should write the function in
white-listed way, that is, expr_needs_snapshot returns false only if
the whole tree consists of only the node types known to the
function. And any unknown node makes the function return true
immediately.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
čt 7. 11. 2019 v 13:03 odesílatel Kyotaro Horiguchi <horikyota.ntt@gmail.com>
napsal:
Hello.
At Tue, 5 Nov 2019 22:14:40 +0100, Pavel Stehule <pavel.stehule@gmail.com>
wrote inHi
pá 23. 8. 2019 v 16:32 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:On 23.08.2019 14:42, Pavel Stehule wrote:
In reality it is not IMMUTABLE function. On second hand, there are lot
of
application that depends on this behave.
It is well know trick how to reduce estimation errors related to JOINs.
When immutable function has constant parameters, then it is evaluatedin
planning time.
So sometimes was necessary to use
SELECT ... FROM tab WHERE foreign_key = immutable_function('constant
parameter')instead JOIN.
It is ugly, but it is working perfectly. I think so until we will have
multi table statistics, this behave should be available in Postgres.Sure, this function should not be used for functional indexes.
What about the following version of the patch?
I am sending review of this small patch.
This small patch reduce a overhead of usage buildin immutable functions
in
volatile functions with simple trick. Starts snapshot only when it is
necessary.In decrease runtime time about 25 % on this small example.
do $$
declare i int;
begin
i := 0;
while i < 10000000
loop
i := i + 1;
end loop;
end;
$$;If there are more expressions, then speedup can be more interesting. If
there are other bottlenecks, then the speedup will be less. 25% is notbad,
so we want to this feature.
I believe so similar method can be used more aggressively with more
significant performance benefit, but this is low hanging fruit and isn't
reason to wait for future.This patch doesn't introduce any new feature, so new tests and new doc is
not necessary.
The patch is readable, well formatted, only comments are too long. Ifixed
it.
All tests passed
I fixed few warnings, and I reformated little bit function
expr_needs_snapshot to use if instead case, what is more usual in these
cases.I think so this code can be marked as ready for commit
I have some comments on this.
expr_needs_snapshot checks out some of the node already checked out in
exec_simple_check_plan but not all. However I don't see the criteria
of the choice.I might be too worrying, but maybe we should write the function in
white-listed way, that is, expr_needs_snapshot returns false only if
the whole tree consists of only the node types known to the
function. And any unknown node makes the function return true
immediately.
has sense
Pavel
Show quoted text
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On 07.11.2019 15:09, Pavel Stehule wrote:
čt 7. 11. 2019 v 13:03 odesílatel Kyotaro Horiguchi
<horikyota.ntt@gmail.com <mailto:horikyota.ntt@gmail.com>> napsal:Hello.
At Tue, 5 Nov 2019 22:14:40 +0100, Pavel Stehule
<pavel.stehule@gmail.com <mailto:pavel.stehule@gmail.com>> wrote inHi
pá 23. 8. 2019 v 16:32 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru <mailto:k.knizhnik@postgrespro.ru>>napsal:
On 23.08.2019 14:42, Pavel Stehule wrote:
In reality it is not IMMUTABLE function. On second hand, there
are lot of
application that depends on this behave.
It is well know trick how to reduce estimation errors related
to JOINs.
When immutable function has constant parameters, then it is
evaluated in
planning time.
So sometimes was necessary to use
SELECT ... FROM tab WHERE foreign_key =
immutable_function('constant
parameter')
instead JOIN.
It is ugly, but it is working perfectly. I think so until we
will have
multi table statistics, this behave should be available in
Postgres.
Sure, this function should not be used for functional indexes.
What about the following version of the patch?
I am sending review of this small patch.
This small patch reduce a overhead of usage buildin immutable
functions in
volatile functions with simple trick. Starts snapshot only when
it is
necessary.
In decrease runtime time about 25 % on this small example.
do $$
declare i int;
begin
i := 0;
while i < 10000000
loop
i := i + 1;
end loop;
end;
$$;If there are more expressions, then speedup can be more
interesting. If
there are other bottlenecks, then the speedup will be less. 25%
is not bad,
so we want to this feature.
I believe so similar method can be used more aggressively with more
significant performance benefit, but this is low hanging fruitand isn't
reason to wait for future.
This patch doesn't introduce any new feature, so new tests and
new doc is
not necessary.
The patch is readable, well formatted, only comments are toolong. I fixed
it.
All tests passed
I fixed few warnings, and I reformated little bit function
expr_needs_snapshot to use if instead case, what is more usualin these
cases.
I think so this code can be marked as ready for commit
I have some comments on this.
expr_needs_snapshot checks out some of the node already checked out in
exec_simple_check_plan but not all. However I don't see the criteria
of the choice.I might be too worrying, but maybe we should write the function in
white-listed way, that is, expr_needs_snapshot returns false only if
the whole tree consists of only the node types known to the
function. And any unknown node makes the function return true
immediately.has sense
There are 62 cases handled by expression_tree_walker.
I have inspected all of them and considered only these 6 to be
"dangerous": intentionally require snapshot.
FuncExpr, OpExpr, Query, SubPlan, AlternativeSubPlan, SubLink.
So if I use "white-list" approach, I have to enumerate all other 62-6=56
cases in expr_needs_snapshot function.
Frankly speaking I do not this that it is good idea. New nodes are
rarely added to the Postgres and expression_tree_walker
in any case has to be changed to handle this new nodes. There are many
other places where tree walker is used to check presence (or absence)
of some properties in the tree. And none is this places assume that some
newly introduced node may require special handling of this property check.
In any case, if you think that explicit enumerations of this 56 cases
(or some subset of them) is good idea, then I will do it.
If you have concerns about some particular nodes, I can add them to
"black list".
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
pá 8. 11. 2019 v 14:31 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:
On 07.11.2019 15:09, Pavel Stehule wrote:
čt 7. 11. 2019 v 13:03 odesílatel Kyotaro Horiguchi <
horikyota.ntt@gmail.com> napsal:Hello.
At Tue, 5 Nov 2019 22:14:40 +0100, Pavel Stehule <pavel.stehule@gmail.com>
wrote inHi
pá 23. 8. 2019 v 16:32 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:On 23.08.2019 14:42, Pavel Stehule wrote:
In reality it is not IMMUTABLE function. On second hand, there are
lot of
application that depends on this behave.
It is well know trick how to reduce estimation errors related to
JOINs.
When immutable function has constant parameters, then it is evaluated
in
planning time.
So sometimes was necessary to use
SELECT ... FROM tab WHERE foreign_key = immutable_function('constant
parameter')instead JOIN.
It is ugly, but it is working perfectly. I think so until we will have
multi table statistics, this behave should be available in Postgres.Sure, this function should not be used for functional indexes.
What about the following version of the patch?
I am sending review of this small patch.
This small patch reduce a overhead of usage buildin immutable functions
in
volatile functions with simple trick. Starts snapshot only when it is
necessary.In decrease runtime time about 25 % on this small example.
do $$
declare i int;
begin
i := 0;
while i < 10000000
loop
i := i + 1;
end loop;
end;
$$;If there are more expressions, then speedup can be more interesting. If
there are other bottlenecks, then the speedup will be less. 25% is notbad,
so we want to this feature.
I believe so similar method can be used more aggressively with more
significant performance benefit, but this is low hanging fruit and isn't
reason to wait for future.This patch doesn't introduce any new feature, so new tests and new doc
is
not necessary.
The patch is readable, well formatted, only comments are too long. Ifixed
it.
All tests passed
I fixed few warnings, and I reformated little bit function
expr_needs_snapshot to use if instead case, what is more usual in these
cases.I think so this code can be marked as ready for commit
I have some comments on this.
expr_needs_snapshot checks out some of the node already checked out in
exec_simple_check_plan but not all. However I don't see the criteria
of the choice.I might be too worrying, but maybe we should write the function in
white-listed way, that is, expr_needs_snapshot returns false only if
the whole tree consists of only the node types known to the
function. And any unknown node makes the function return true
immediately.has sense
There are 62 cases handled by expression_tree_walker.
I have inspected all of them and considered only these 6 to be
"dangerous": intentionally require snapshot.
FuncExpr, OpExpr, Query, SubPlan, AlternativeSubPlan, SubLink.So if I use "white-list" approach, I have to enumerate all other 62-6=56
cases in expr_needs_snapshot function.
Frankly speaking I do not this that it is good idea. New nodes are rarely
added to the Postgres and expression_tree_walker
in any case has to be changed to handle this new nodes. There are many
other places where tree walker is used to check presence (or absence)
of some properties in the tree. And none is this places assume that some
newly introduced node may require special handling of this property check.In any case, if you think that explicit enumerations of this 56 cases (or
some subset of them) is good idea, then I will do it.
If you have concerns about some particular nodes, I can add them to
"black list".
Is question how to implement this feature safely. Isn't possible to use
same check for functional indexes?
Show quoted text
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Konstantin Knizhnik <k.knizhnik@postgrespro.ru> writes:
čt 7. 11. 2019 v 13:03 odesílatel Kyotaro Horiguchi
<horikyota.ntt@gmail.com <mailto:horikyota.ntt@gmail.com>> napsal:I might be too worrying, but maybe we should write the function in
white-listed way, that is, expr_needs_snapshot returns false only if
the whole tree consists of only the node types known to the
function. And any unknown node makes the function return true
immediately.
There are 62 cases handled by expression_tree_walker.
I have inspected all of them and considered only these 6 to be
"dangerous": intentionally require snapshot.
FuncExpr, OpExpr, Query, SubPlan, AlternativeSubPlan, SubLink.
This is false on its face. For example, considering OpExpr but not
its aliases DistinctExpr or NullIfExpr is just obviously wrong.
ScalarArrayOpExpr is another node that might contain a user-defined
function. CoerceViaIO calls I/O functions that might easily not be
immutable. RowCompareExpr calls comparison functions that might
not be either (they are btree comparison functions, but they could
be cross-type comparisons, and we only require those to be stable).
CoerceToDomain could invoke just about anything at all. Need I
go on?
Frankly speaking I do not this that it is good idea. New nodes are
rarely added to the Postgres and expression_tree_walker
in any case has to be changed to handle this new nodes. There are many
other places where tree walker is used to check presence (or absence)
of some properties in the tree. And none is this places assume that some
newly introduced node may require special handling of this property check.
None of those statements are true, in my experience.
In general, this patch seems like it's learned nothing from our
experiences with the late and unlamented exec_simple_check_node()
(cf commit 00418c612). Having a piece of plpgsql that has to know
about all possible "simple expression" node types is just a major
maintenance headache; but there is no short-cut solution that is
really going to be acceptable. Either you're unsafe, or you fail
to optimize cases that users will complain about, or you write
and maintain a lot of code.
I'm also pretty displeased with the is-it-in-the-pg_catalog-schema
tests. Those are expensive, requiring additional catalog lookups,
and they prove just about nothing. There are extensions that shove
stuff into pg_catalog (look no further than contrib/adminpack), or
users could do it, so you really still are relying on the whole world
to get immutability right. If we're going to not trust immutability
markings on user-defined objects, I'd be inclined to do it by
checking that the object's OID is less than FirstGenbkiObjectId.
But maybe we should just forget that bit of paranoia and rely solely
on contain_mutable_functions(). That gets rid of the new maintenance
requirement, and I think arguably it's OK. An "immutable" function
whose result depends on changes that were just made by the calling
function is pretty obviously mislabeled, so people won't have much of
a leg to stand on if they complain about that. Pavel's argument
upthread that people sometimes intentionally mislabel functions as
immutable isn't really relevant: in his example, at least, the user
is intentionally trying to get the function to be evaluated early.
So whether it sees the effects of changes just made by the calling
function doesn't seem like it would matter to such a user.
regards, tom lane
On 11.11.2019 20:22, Tom Lane wrote:
None of those statements are true, in my experience.
In general, this patch seems like it's learned nothing from our
experiences with the late and unlamented exec_simple_check_node()
(cf commit 00418c612). Having a piece of plpgsql that has to know
about all possible "simple expression" node types is just a major
maintenance headache; but there is no short-cut solution that is
really going to be acceptable. Either you're unsafe, or you fail
to optimize cases that users will complain about, or you write
and maintain a lot of code.I'm also pretty displeased with the is-it-in-the-pg_catalog-schema
tests. Those are expensive, requiring additional catalog lookups,
and they prove just about nothing. There are extensions that shove
stuff into pg_catalog (look no further than contrib/adminpack), or
users could do it, so you really still are relying on the whole world
to get immutability right. If we're going to not trust immutability
markings on user-defined objects, I'd be inclined to do it by
checking that the object's OID is less than FirstGenbkiObjectId.But maybe we should just forget that bit of paranoia and rely solely
on contain_mutable_functions(). That gets rid of the new maintenance
requirement, and I think arguably it's OK. An "immutable" function
whose result depends on changes that were just made by the calling
function is pretty obviously mislabeled, so people won't have much of
a leg to stand on if they complain about that. Pavel's argument
upthread that people sometimes intentionally mislabel functions as
immutable isn't really relevant: in his example, at least, the user
is intentionally trying to get the function to be evaluated early.
So whether it sees the effects of changes just made by the calling
function doesn't seem like it would matter to such a user.regards, tom lane
In my opinion contain_mutable_functions() is the best solution.
But if it is not acceptable, I will rewrite the patch in white-list fashion.
I do not understand the argument about expensive
is-it-in-the-pg_catalog-schema test.
IsCatalogNameaspace is doing just simple comparison without any catalog
lookups:
bool
IsCatalogNamespace(Oid namespaceId)
{
return namespaceId == PG_CATALOG_NAMESPACE;
}
I may agree that it "proves nothing" if extension can put stuff in
pg_catalog.
I can replace it with comparison with FirstGenbkiObjectId.
But all this makes sense only if using contain_mutable_function() is not
acceptable.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
At Tue, 12 Nov 2019 11:27:24 +0300, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote in
On 11.11.2019 20:22, Tom Lane wrote:
None of those statements are true, in my experience.
In general, this patch seems like it's learned nothing from our
experiences with the late and unlamented exec_simple_check_node()
(cf commit 00418c612). Having a piece of plpgsql that has to know
about all possible "simple expression" node types is just a major
maintenance headache; but there is no short-cut solution that is
really going to be acceptable. Either you're unsafe, or you fail
to optimize cases that users will complain about, or you write
and maintain a lot of code.I'm also pretty displeased with the is-it-in-the-pg_catalog-schema
tests. Those are expensive, requiring additional catalog lookups,
and they prove just about nothing. There are extensions that shove
stuff into pg_catalog (look no further than contrib/adminpack), or
users could do it, so you really still are relying on the whole world
to get immutability right. If we're going to not trust immutability
markings on user-defined objects, I'd be inclined to do it by
checking that the object's OID is less than FirstGenbkiObjectId.But maybe we should just forget that bit of paranoia and rely solely
on contain_mutable_functions(). That gets rid of the new maintenance
requirement, and I think arguably it's OK. An "immutable" function
whose result depends on changes that were just made by the calling
function is pretty obviously mislabeled, so people won't have much of
a leg to stand on if they complain about that. Pavel's argument
upthread that people sometimes intentionally mislabel functions as
immutable isn't really relevant: in his example, at least, the user
is intentionally trying to get the function to be evaluated early.
So whether it sees the effects of changes just made by the calling
function doesn't seem like it would matter to such a user.regards, tom lane
In my opinion contain_mutable_functions() is the best solution.
But if it is not acceptable, I will rewrite the patch in white-list
fashion.
I agree for just relying on contain_mutable_functions for the same
reasons to Tom.
I do not understand the argument about expensive
is-it-in-the-pg_catalog-schema test.
IsCatalogNameaspace is doing just simple comparison without any
catalog lookups:bool
IsCatalogNamespace(Oid namespaceId)
{
return namespaceId == PG_CATALOG_NAMESPACE;
}I may agree that it "proves nothing" if extension can put stuff in
pg_catalog.
I can replace it with comparison with FirstGenbkiObjectId.
But all this makes sense only if using contain_mutable_function() is
not acceptable.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
At Tue, 12 Nov 2019 11:27:24 +0300, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote in
In my opinion contain_mutable_functions() is the best solution.
But if it is not acceptable, I will rewrite the patch in white-list
fashion.
I agree for just relying on contain_mutable_functions for the same
reasons to Tom.
I've set the CF entry to "Waiting on Author" pending a new patch
that does it like that.
I do not understand the argument about expensive
is-it-in-the-pg_catalog-schema test.
IsCatalogNameaspace is doing just simple comparison without any
catalog lookups:
As far as that goes, get_func_namespace() is the expensive part,
not IsCatalogNamespace(). If we were going to go down this path,
it'd perhaps be worthwhile to expand that and the adjacent
func_volatile() test into bulkier code that just does one syscache
fetch of the pg_proc row. But we're not, so it's moot.
regards, tom lane
I've set the CF entry to "Waiting on Author" pending a new patch
that does it like that.
With contain_mutable_functions the patch becomes trivial.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Attachments:
plpgsql_exec_expr-3.patchtext/x-patch; name=plpgsql_exec_expr-3.patchDownload
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index a4697dc..55ed878 100644
--- a/src/pl/plpgsql/src/pl_exec.c
+++ b/src/pl/plpgsql/src/pl_exec.c
@@ -30,6 +30,7 @@
#include "funcapi.h"
#include "miscadmin.h"
#include "nodes/nodeFuncs.h"
+#include "optimizer/clauses.h"
#include "optimizer/optimizer.h"
#include "parser/parse_coerce.h"
#include "parser/parse_type.h"
@@ -6157,7 +6158,7 @@ exec_eval_simple_expr(PLpgSQL_execstate *estate,
* updates made so far by our own function.
*/
oldcontext = MemoryContextSwitchTo(get_eval_mcontext(estate));
- if (!estate->readonly_func)
+ if (!estate->readonly_func && expr->expr_needs_snapshot)
{
CommandCounterIncrement();
PushActiveSnapshot(GetTransactionSnapshot());
@@ -6182,7 +6183,7 @@ exec_eval_simple_expr(PLpgSQL_execstate *estate,
estate->paramLI->parserSetupArg = save_setup_arg;
- if (!estate->readonly_func)
+ if (!estate->readonly_func && expr->expr_needs_snapshot)
PopActiveSnapshot();
MemoryContextSwitchTo(oldcontext);
@@ -8046,6 +8047,7 @@ exec_save_simple_expr(PLpgSQL_expr *expr, CachedPlan *cplan)
* current transaction".
*/
expr->expr_simple_expr = tle_expr;
+ expr->expr_needs_snapshot = contain_mutable_functions((Node*)tle_expr);
expr->expr_simple_generation = cplan->generation;
expr->expr_simple_state = NULL;
expr->expr_simple_in_use = false;
diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h
index f66b2ba..454131f 100644
--- a/src/pl/plpgsql/src/plpgsql.h
+++ b/src/pl/plpgsql/src/plpgsql.h
@@ -243,6 +243,7 @@ typedef struct PLpgSQL_expr
*/
ExprState *expr_simple_state; /* eval tree for expr_simple_expr */
bool expr_simple_in_use; /* true if eval tree is active */
+ bool expr_needs_snapshot; /* true if simple expression calls non-immutable functions or performs subqueries */
LocalTransactionId expr_simple_lxid;
} PLpgSQL_expr;
čt 21. 11. 2019 v 10:31 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:
I've set the CF entry to "Waiting on Author" pending a new patch
that does it like that.With contain_mutable_functions the patch becomes trivial.
Stable functions doesn't need own snapshot too, so it is not fully correct,
but it is on safe side.
Pavel
Show quoted text
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Pavel Stehule <pavel.stehule@gmail.com> writes:
čt 21. 11. 2019 v 10:31 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:With contain_mutable_functions the patch becomes trivial.
Stable functions doesn't need own snapshot too, so it is not fully correct,
but it is on safe side.
No, I doubt that. A stable function is allowed to inspect database state,
and if it's being called by a volatile function, it has every right to
expect that it'd see updates-so-far made by the volatile function.
regards, tom lane
čt 21. 11. 2019 v 20:44 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Pavel Stehule <pavel.stehule@gmail.com> writes:
čt 21. 11. 2019 v 10:31 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:With contain_mutable_functions the patch becomes trivial.
Stable functions doesn't need own snapshot too, so it is not fully
correct,
but it is on safe side.
No, I doubt that. A stable function is allowed to inspect database state,
and if it's being called by a volatile function, it has every right to
expect that it'd see updates-so-far made by the volatile function.
for this I need new snapshot?
Show quoted text
regards, tom lane
At Fri, 22 Nov 2019 06:15:25 +0100, Pavel Stehule <pavel.stehule@gmail.com> wrote in
čt 21. 11. 2019 v 20:44 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Pavel Stehule <pavel.stehule@gmail.com> writes:
čt 21. 11. 2019 v 10:31 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:With contain_mutable_functions the patch becomes trivial.
Stable functions doesn't need own snapshot too, so it is not fully
correct,
but it is on safe side.
No, I doubt that. A stable function is allowed to inspect database state,
and if it's being called by a volatile function, it has every right to
expect that it'd see updates-so-far made by the volatile function.for this I need new snapshot?
It depends on what we regard as "query" or "command" here. It seems to
me that every line in a plpgsql function is regarded as a "query" for
stable function, even if the function is called in another "query".
In short, we need it, I think.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
pá 22. 11. 2019 v 7:33 odesílatel Kyotaro Horiguchi <horikyota.ntt@gmail.com>
napsal:
At Fri, 22 Nov 2019 06:15:25 +0100, Pavel Stehule <pavel.stehule@gmail.com>
wrote inčt 21. 11. 2019 v 20:44 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Pavel Stehule <pavel.stehule@gmail.com> writes:
čt 21. 11. 2019 v 10:31 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:With contain_mutable_functions the patch becomes trivial.
Stable functions doesn't need own snapshot too, so it is not fully
correct,
but it is on safe side.
No, I doubt that. A stable function is allowed to inspect database
state,
and if it's being called by a volatile function, it has every right to
expect that it'd see updates-so-far made by the volatile function.for this I need new snapshot?
It depends on what we regard as "query" or "command" here. It seems to
me that every line in a plpgsql function is regarded as a "query" for
stable function, even if the function is called in another "query".In short, we need it, I think.
ok, then the limits just to only immutable functions is wrong
I test it, and there is a problem already. We doesn't raise a exception,
but the result is wrong
create table foo(a int);
create or replace function f1(int)
returns void as $$
begin
insert into foo values($1);
end;
$$ language plpgsql;
create or replace function f2(int)
returns void as $$declare r record;
begin
perform f1(); for r in select * from foo loop raise notice '%', r; end
loop;
end;
$$ language plpgsql immutable; -- or stable with same behave
So current state is:
a) we allow to call volatile functions from nonvolatile (stable, immutable)
that really does write
b) but this change is not visible in parent nonvolatile functions. Is
visible only in volatile functions.
Is it expected behave?
So now, there are described issues already. And the limit to just immutable
function is not enough - these functions should be immutable buildin.
The core of these problems is based on function's flags related to planner
optimizations.
Maybe other flags WRITE | READ | PURE can help.
Now we don't know if volatile function writes to database or not - surely
random function doesn't do this. We can implement new set of flags, that
can reduce a overhead with snapshots.
The function random() can be VOLATILE PURE - and we will know, so result
of this function is not stable, but this function doesn't touch data engine.
When we don't have these flags, then the current logic is used, when we
have these flags, then it will be used. These flags can be more strict
we should not to allow any WRITE from READ function, or we should not allow
READ from PURE functions.
Notes, comments?
Pavel
This test should
Show quoted text
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
On 22.11.2019 10:08, Pavel Stehule wrote:
I test it, and there is a problem already. We doesn't raise a
exception, but the result is wrongcreate table foo(a int);
create or replace function f1(int)
returns void as $$
begin
insert into foo values($1);
end;
$$ language plpgsql;create or replace function f2(int)
returns void as $$declare r record;
begin
perform f1(); for r in select * from foo loop raise notice '%', r;
end loop;
end;
$$ language plpgsql immutable; -- or stable with same behaveSo current state is:
a) we allow to call volatile functions from nonvolatile (stable,
immutable) that really does write
b) but this change is not visible in parent nonvolatile functions. Is
visible only in volatile functions.Is it expected behave?
I think that in theory it is definitely not correct to call volatile
function from non-volatile.
But there are two questions:
1. Are we able to check it? Please taken in account that:
- at the moment of "create function f2()" called function f1() may
not yet be defined
- instead of perform f1() it can do "execute 'select f1()'" and it is
not possible to check it at compile time.
2. Is it responsibility of programmer to correctly specify function
properties or it should be done by compiler?
If we follow YWIYGI rule, then your definition of f2() is not correct
and that it is why you will get wrong result in this case.
If we what to completely rely on compiler, then... we do not not
volatile/immutable/stable/qualifiers at all! Compiler should deduce this
information itself.
But it will be non-trivial if ever possible, take in account 1)
In principle it is possible to add checks which will produce warning in
case of calling volatile function or executing dynamic SQL from
non-volatile function.
If such extra checking will be considered useful, I can propose patch
doing it.
But IMHO optimizer should rely on function qualifier provided by
programmer and it is acceptable to produce wrong result if this
information is not correct.
So now, there are described issues already. And the limit to just
immutable function is not enough - these functions should be immutable
buildin.The core of these problems is based on function's flags related to
planner optimizations.Maybe other flags WRITE | READ | PURE can help.
Now we don't know if volatile function writes to database or not -
surely random function doesn't do this. We can implement new set of
flags, that can reduce a overhead with snapshots.The function random() can be VOLATILE PURE - and we will know, so
result of this function is not stable, but this function doesn't touch
data engine.When we don't have these flags, then the current logic is used, when
we have these flags, then it will be used. These flags can be more strictwe should not to allow any WRITE from READ function, or we should not
allow READ from PURE functions.Notes, comments?
I think that even current model with "volatile", "immutable" and
"stable" is complex enough.
Adding more qualifiers will make it even more obscure and error-prone.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
pá 22. 11. 2019 v 8:32 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:
On 22.11.2019 10:08, Pavel Stehule wrote:
I test it, and there is a problem already. We doesn't raise a exception,
but the result is wrongcreate table foo(a int);
create or replace function f1(int)
returns void as $$
begin
insert into foo values($1);
end;
$$ language plpgsql;create or replace function f2(int)
returns void as $$declare r record;
begin
perform f1(); for r in select * from foo loop raise notice '%', r; end
loop;
end;
$$ language plpgsql immutable; -- or stable with same behaveSo current state is:
a) we allow to call volatile functions from nonvolatile (stable,
immutable) that really does write
b) but this change is not visible in parent nonvolatile functions. Is
visible only in volatile functions.Is it expected behave?
I think that in theory it is definitely not correct to call volatile
function from non-volatile.
But there are two questions:
1. Are we able to check it? Please taken in account that:
- at the moment of "create function f2()" called function f1() may not
yet be defined
- instead of perform f1() it can do "execute 'select f1()'" and it is not
possible to check it at compile time.
It's not possible to check it compile time now.
2. Is it responsibility of programmer to correctly specify function
properties or it should be done by compiler?
If we follow YWIYGI rule, then your definition of f2() is not correct
and that it is why you will get wrong result in this case.
maybe - a) but it is not documented, b) is not a postgresql's philosophy
return bad result - and c) it is not user friendly. There should be raised
error or result should be correct.
Theoretically this feature can be used for logging - you can write to log
table from immutable or stable function - so it can has some use case.
Probably if we implement autonomous transactions, then this behave should
be correct.
If we what to completely rely on compiler, then... we do not not
volatile/immutable/stable/qualifiers at all! Compiler should deduce this
information itself.
But it will be non-trivial if ever possible, take in account 1)
I am able to do it in plpgsql_check for plpgsql. But probably it is not
possible do it for PLPerl, PLPythonu, ..
In principle it is possible to add checks which will produce warning in
case of calling volatile function or executing dynamic SQL from
non-volatile function.
If such extra checking will be considered useful, I can propose patch
doing it.
But IMHO optimizer should rely on function qualifier provided by
programmer and it is acceptable to produce wrong result if this information
is not correct.
We should to distinguish between bad result and not well optimized plan.
So now, there are described issues already. And the limit to just
immutable function is not enough - these functions should be immutable
buildin.The core of these problems is based on function's flags related to planner
optimizations.Maybe other flags WRITE | READ | PURE can help.
Now we don't know if volatile function writes to database or not - surely
random function doesn't do this. We can implement new set of flags, that
can reduce a overhead with snapshots.The function random() can be VOLATILE PURE - and we will know, so result
of this function is not stable, but this function doesn't touch data engine.When we don't have these flags, then the current logic is used, when we
have these flags, then it will be used. These flags can be more strictwe should not to allow any WRITE from READ function, or we should not
allow READ from PURE functions.Notes, comments?
I think that even current model with "volatile", "immutable" and "stable"
is complex enough.
Adding more qualifiers will make it even more obscure and error-prone.
I don't think - the classic example is random() function. It's volatile,
but you don't need special snapshot for calling this function.
Still any change of behave can breaks lot of applications, because how we
can see, Postgres is very tolerant now.
Show quoted text
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 22.11.2019 11:05, Pavel Stehule wrote:
We should to distinguish between bad result and not well optimized plan.
If it is not possible to implement runtime check tha timmutable function
is not making any changes in database.
Please notice, that even right now without any get snapshot optimization
Postgres can produce incorrect result in case of incorrectly specidied
immutable or stable qualifiers.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
At Fri, 22 Nov 2019 08:08:24 +0100, Pavel Stehule <pavel.stehule@gmail.com> wrote in
pá 22. 11. 2019 v 7:33 odesílatel Kyotaro Horiguchi <horikyota.ntt@gmail.com>
napsal:At Fri, 22 Nov 2019 06:15:25 +0100, Pavel Stehule <pavel.stehule@gmail.com>
wrote inčt 21. 11. 2019 v 20:44 odesílatel Tom Lane <tgl@sss.pgh.pa.us> napsal:
Pavel Stehule <pavel.stehule@gmail.com> writes:
čt 21. 11. 2019 v 10:31 odesílatel Konstantin Knizhnik <
k.knizhnik@postgrespro.ru> napsal:With contain_mutable_functions the patch becomes trivial.
Stable functions doesn't need own snapshot too, so it is not fully
correct,
but it is on safe side.
No, I doubt that. A stable function is allowed to inspect database
state,
and if it's being called by a volatile function, it has every right to
expect that it'd see updates-so-far made by the volatile function.for this I need new snapshot?
It depends on what we regard as "query" or "command" here. It seems to
me that every line in a plpgsql function is regarded as a "query" for
stable function, even if the function is called in another "query".In short, we need it, I think.
ok, then the limits just to only immutable functions is wrong
I test it, and there is a problem already. We doesn't raise a exception,
but the result is wrongcreate table foo(a int);
create or replace function f1(int)
returns void as $$
begin
insert into foo values($1);
end;
$$ language plpgsql;create or replace function f2(int)
returns void as $$declare r record;
begin
perform f1(); for r in select * from foo loop raise notice '%', r; end
loop;
end;
$$ language plpgsql immutable; -- or stable with same behaveSo current state is:
a) we allow to call volatile functions from nonvolatile (stable, immutable)
that really does write
b) but this change is not visible in parent nonvolatile functions. Is
visible only in volatile functions.Is it expected behave?
I'm not sure, volatility is the total bahavior of the function,
regardless of whatever the function does internally. Even though I'm
not sure how to use volatile functions in non-volatile functions, I
don't see direct reason or how to inhibit that and I think we don't
even need to bother that. It's owe to the definier of the function.
So now, there are described issues already. And the limit to just immutable
function is not enough - these functions should be immutable buildin.The core of these problems is based on function's flags related to planner
optimizations.Maybe other flags WRITE | READ | PURE can help.
Now we don't know if volatile function writes to database or not - surely
random function doesn't do this. We can implement new set of flags, that
can reduce a overhead with snapshots.The function random() can be VOLATILE PURE - and we will know, so result
of this function is not stable, but this function doesn't touch data engine.When we don't have these flags, then the current logic is used, when we
have these flags, then it will be used. These flags can be more strictwe should not to allow any WRITE from READ function, or we should not allow
READ from PURE functions.Notes, comments?
Pavel
I think such fine-grained categorization is beyond our maintainance
capability and users still much cannot follow. I recall of an extreme
discussion that the result of immutable system functions still can
mutate beyond major version upgrade. Volatility is mere a hint and,
perhaps as Robert said somewhere, the baseline we should do is to
avoid crash for any possible configuration.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
At Fri, 22 Nov 2019 08:08:24 +0100, Pavel Stehule <pavel.stehule@gmail.com> wrote in
a) we allow to call volatile functions from nonvolatile (stable, immutable)
that really does write
b) but this change is not visible in parent nonvolatile functions. Is
visible only in volatile functions.Is it expected behave?
I'm not sure, volatility is the total bahavior of the function,
regardless of whatever the function does internally. Even though I'm
not sure how to use volatile functions in non-volatile functions, I
don't see direct reason or how to inhibit that and I think we don't
even need to bother that. It's owe to the definier of the function.
I'm pretty sure we had this discussion long ago when we implemented
the existing read-only-function restrictions in plpgsql. Yeah, in
principle an immutable function should refuse to call non-immutable
functions, but the practical costs and benefits of doing that aren't
very attractive. The existing rules are inexpensive to enforce and
catch most mistakes in this area. Catching the other few percent of
mistakes would require a really significant change, not only on our
part but also users' parts --- for example, forgetting to label a
function as immutable when it could be might break your application
entirely.
I made some cosmetic fixes in the proposed patch and pushed it.
regards, tom lane