SQLFunctionCache and generic plans
Hello,
It has been brought to my attention that SQL functions always use generic
plans.
Take this function for example:
create or replace function test_plpgsql(p1 oid) returns text as $$
BEGIN
RETURN (SELECT relname FROM pg_class WHERE oid = p1 OR p1 IS NULL LIMIT 1);
END;
$$ language plpgsql;
As expected, the PlanCache takes care of generating parameter specific plans,
and correctly prunes the redundant OR depending on whether we call the function
with a NULL value or not:
ro=# select test_plpgsql(NULL);
LOG: duration: 0.030 ms plan:
Query Text: (SELECT relname FROM pg_class WHERE oid = p1 OR p1 IS NULL LIMIT
1)
Result (cost=0.04..0.05 rows=1 width=64)
InitPlan 1 (returns $0)
-> Limit (cost=0.00..0.04 rows=1 width=64)
-> Seq Scan on pg_class (cost=0.00..18.12 rows=412 width=64)
LOG: duration: 0.662 ms plan:
Query Text: select test_plpgsql(NULL);
Result (cost=0.00..0.26 rows=1 width=32)
ro=# select test_plpgsql(1);
LOG: duration: 0.075 ms plan:
Query Text: (SELECT relname FROM pg_class WHERE oid = p1 OR p1 IS NULL LIMIT
1)
Result (cost=8.29..8.30 rows=1 width=64)
InitPlan 1 (returns $0)
-> Limit (cost=0.27..8.29 rows=1 width=64)
-> Index Scan using pg_class_oid_index on pg_class
(cost=0.27..8.29 rows=1 width=64)
Index Cond: (oid = '1'::oid)
LOG: duration: 0.675 ms plan:
Query Text: select test_plpgsql(1);
Result (cost=0.00..0.26 rows=1 width=32)
But writing the same function in SQL:
create or replace function test_sql(p1 oid) returns text as $$
SELECT relname FROM pg_class WHERE oid = p1 OR p1 IS NULL LIMIT 1
$$ language sql;
we end up with a generic plan:
ro=# select test_sql(1);
LOG: duration: 0.287 ms plan:
Query Text: SELECT relname FROM pg_class WHERE oid = p1 OR p1 IS NULL LIMIT 1
Query Parameters: $1 = '1'
Limit (cost=0.00..6.39 rows=1 width=32)
-> Seq Scan on pg_class (cost=0.00..19.16 rows=3 width=32)
Filter: ((oid = $1) OR ($1 IS NULL))
This is because SQL functions are planned once for the whole query,
using a dedicated SQLFunctionCache instead of the regular PlanCache
machinery.
The following comment can be found in functions.c, about the SQLFunctionCache:
* Note that currently this has only the lifespan of the calling query.
* Someday we should rewrite this code to use plancache.c to save parse/plan
* results for longer than that.
I would be interested in working on this, primarily to avoid this problem of
having generic query plans for SQL functions but maybe having a longer lived
cache as well would be nice to have.
Is there any reason not to, or pitfalls we would like to avoid?
Best regards,
--
Ronan Dunklau
Ronan Dunklau <ronan.dunklau@aiven.io> writes:
> The following comment can be found in functions.c, about the SQLFunctionCache:
>     * Note that currently this has only the lifespan of the calling query.
>     * Someday we should rewrite this code to use plancache.c to save parse/plan
>     * results for longer than that.
> I would be interested in working on this, primarily to avoid this problem of
> having generic query plans for SQL functions but maybe having a longer lived
> cache as well would be nice to have.
> Is there any reason not to, or pitfalls we would like to avoid?
AFAIR it's just lack of round tuits. There would probably be some
semantic side-effects, though if you pay attention you could likely
make things better while you are at it. The existing behavior of
parsing and planning all the statements at once is not very desirable
--- for instance, it doesn't work to do
CREATE TABLE foo AS ...;
SELECT * FROM foo;
I think if we're going to nuke this code and start over, we should
try to make that sort of case work.
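That failing case can be sketched concretely (hypothetical function name; check_function_bodies must be off so that creating the function doesn't already trip over the missing table):

```sql
-- Sketch: because a SQL function's statements are currently all parsed
-- and planned up front at call time, the SELECT is analyzed before the
-- CREATE TABLE has run, so the call fails.
SET check_function_bodies = off;

CREATE FUNCTION make_and_read() RETURNS integer AS $$
  CREATE TABLE foo AS SELECT 1 AS x;
  SELECT x FROM foo;
$$ LANGUAGE sql;

-- SELECT make_and_read();  -- fails: relation "foo" does not exist
```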
regards, tom lane
Tom Lane wrote on 2023-02-07 18:29:
> AFAIR it's just lack of round tuits. There would probably be some
> semantic side-effects, though if you pay attention you could likely
> make things better while you are at it. The existing behavior of
> parsing and planning all the statements at once is not very desirable
> --- for instance, it doesn't work to do
>     CREATE TABLE foo AS ...;
>     SELECT * FROM foo;
> I think if we're going to nuke this code and start over, we should
> try to make that sort of case work.
Hi.
I've tried to make SQL functions use the CachedPlan machinery. The main
goal was to allow SQL functions to use custom plans (the work started
from the question of why a SQL function is so slow compared to a plpgsql
one). It turned out that the plpgsql function used a custom plan and
eliminated scans of all irrelevant partitions, but exec-time pruning
didn't cope with a ScalarArrayOpExpr filtering data on an int[]
parameter.
In the current prototype there are two restrictions. The first is that
the CachedPlan has the lifetime of a query - it's not saved for future
use, as we don't have something like plpgsql's hashtable for long-lived
function storage. Second, SQL-language functions in sql_body form (with
a stored queryTree_list) are handled the old way, as we currently lack
the tools to build cached plans from query trees.
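For context, the sql_body form mentioned here is the SQL-standard syntax, whose body is parsed at CREATE time and stored in pg_proc.prosqlbody rather than kept as raw source. A minimal sketch of the two forms (hypothetical function names):

```sql
-- Classic form: the body is a string, re-parsed from raw source,
-- so raw parse trees are available to the plan cache
CREATE FUNCTION f_old(p oid) RETURNS text AS $$
  SELECT relname FROM pg_class WHERE oid = p
$$ LANGUAGE sql;

-- SQL-standard form: the body is parsed at CREATE time and stored as
-- a query tree in prosqlbody; no raw parse tree is kept
CREATE FUNCTION f_new(p oid) RETURNS text
LANGUAGE sql
BEGIN ATOMIC
  SELECT relname FROM pg_class WHERE oid = p;
END;
```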
Currently this change solves the issue of inefficient plans for queries
over partitioned tables. For example, a function like
CREATE OR REPLACE FUNCTION public.test_get_records(ids integer[])
RETURNS SETOF test
LANGUAGE sql
AS $function$
select *
from test
where id = any (ids)
$function$;
for a hash-distributed table test can perform pruning at plan time and
get a plan like
Append (cost=0.00..51.88 rows=26 width=36)
  ->  Seq Scan on test_0 test_1 (cost=0.00..25.88 rows=13 width=36)
        Filter: (id = ANY ('{1,2}'::integer[]))
  ->  Seq Scan on test_2 (cost=0.00..25.88 rows=13 width=36)
        Filter: (id = ANY ('{1,2}'::integer[]))
instead of
Append (cost=0.00..155.54 rows=248 width=36)
  ->  Seq Scan on test_0 test_1 (cost=0.00..38.58 rows=62 width=36)
        Filter: (id = ANY ($1))
  ->  Seq Scan on test_1 test_2 (cost=0.00..38.58 rows=62 width=36)
        Filter: (id = ANY ($1))
  ->  Seq Scan on test_2 test_3 (cost=0.00..38.58 rows=62 width=36)
        Filter: (id = ANY ($1))
  ->  Seq Scan on test_3 test_4 (cost=0.00..38.58 rows=62 width=36)
        Filter: (id = ANY ($1))
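The table definition wasn't shown; a hypothetical setup matching the plans above would be a four-way hash-partitioned table:

```sql
CREATE TABLE test (id int, v text) PARTITION BY HASH (id);
CREATE TABLE test_0 PARTITION OF test FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE test_1 PARTITION OF test FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE test_2 PARTITION OF test FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE test_3 PARTITION OF test FOR VALUES WITH (MODULUS 4, REMAINDER 3);

-- With a custom plan, ANY('{1,2}') lets the planner prune partitions
-- at plan time; with a generic plan, ANY($1) forces scanning all four.
```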
This patch definitely requires more work, and I'm sharing it to get some
early feedback.
What should we do with "pre-parsed" SQL functions (when prosrc is
empty)? How should we create cached plans when we don't have raw
parsetrees?
Currently we can create cached plans without raw parse trees, but this
means that plan revalidation doesn't work: choose_custom_plan() always
returns false and we get a generic plan. Perhaps we need some form of
GetCachedPlan() which ignores raw_parse_tree?
In that case, how could we possibly cache plans for the session lifetime
(like the plpgsql language does) if we can't use the cached-plan
revalidation machinery?
I hope to get some hints on how to move forward.
--
Best regards,
Alexander Pyhalov,
Postgres Professional
Attachments:
0001-Use-custom-plan-machinery-for-SQL-functions.patch (+265 -71)
Hi, Alexander!
On Tue, Sep 3, 2024 at 10:33 AM Alexander Pyhalov
<a.pyhalov@postgrespro.ru> wrote:
> What should we do with "pre-parsed" SQL functions (when prosrc is
> empty)? How should we create cached plans when we don't have raw
> parsetrees?
> Currently we can create cached plans without raw parsetrees, but this
> means that plan revalidation doesn't work, choose_custom_plan()
> always returns false and we get generic plan. Perhaps, we need some form
> of GetCachedPlan(), which ignores raw_parse_tree?
I don't think you need a new form of GetCachedPlan(). Instead, it
seems that StmtPlanRequiresRevalidation() should be revised. As I
gathered from the comments and the d8b2fcc9d4 commit message, the
primary goal was to skip revalidation of utility statements. Skipping
revalidation was a positive side effect, as long as we didn't support
custom plans for them anyway. But as you're going to change this,
StmtPlanRequiresRevalidation() needs to be revised.
I also think it's not necessary to implement a long-lived plan cache in
the initial patch. The work could be split into two patches. The first
could implement a query-lifetime plan cache; this is already beneficial
by itself, as you've shown by example. The second could implement the
long-lived plan cache.
I appreciate your work in this direction. I hope you got the feedback
to go ahead and work on remaining issues.
------
Regards,
Alexander Korotkov
Supabase
Hi.
>> What should we do with "pre-parsed" SQL functions (when prosrc is
>> empty)? How should we create cached plans when we don't have raw
>> parsetrees?
>> Currently we can create cached plans without raw parsetrees, but this
>> means that plan revalidation doesn't work, choose_custom_plan()
>> always returns false and we get generic plan. Perhaps, we need some
>> form of GetCachedPlan(), which ignores raw_parse_tree?
> I don't think you need a new form of GetCachedPlan(). Instead, it
> seems that StmtPlanRequiresRevalidation() should be revised. As I got
> from comments and the d8b2fcc9d4 commit message, the primary goal was
> to skip revalidation of utility statements. Skipping revalidation was
> a positive side effect, as long as we didn't support custom plans for
> them anyway. But as you're going to change this,
> StmtPlanRequiresRevalidation() needs to be revised.
Thanks for the feedback.
I've modified StmtPlanRequiresRevalidation() so that it looks at the
query's command type. I'm not sure if that's enough, or if I have to
recreate something closer to the stmt_requires_parse_analysis() logic.
I was wondering if this could trigger plan revalidation in
RevalidateCachedQuery(). I suppose not, as we plan in the executor (so
we shouldn't catch a userid change or see changes in related objects).
Revalidation would kill our plan, destroying resultDesc.
Also, while looking at this, I fixed the processing of INSTEAD OF rules
(which would lead to a NULL execution_state).
--
Best regards,
Alexander Pyhalov,
Postgres Professional
Attachments:
v2-0001-Use-custom-plan-machinery-for-SQL-functions.patch (+385 -86)
Hi
On Tue, Dec 31, 2024 at 16:36 Alexander Pyhalov <a.pyhalov@postgrespro.ru>
wrote:
--
there are a lot of failures found by the tester
Please, can you check it?
regards
Pavel
Pavel Stehule wrote on 2024-12-31 18:39:
> there are lot of fails found by tester
> Please, can you check it?
Hi. Sorry for the late response - we had holidays here, and later there
was some urgent work left over from 2024 :)
Do you mean the failures on check-world?
It seems
commit a8ccf4e93a7eeaae66007bbf78cf9183ceb1b371
Author: Richard Guo <rguo@postgresql.org>
Date: Tue Nov 26 09:25:18 2024 +0900
    Reordering DISTINCT keys to match input path's pathkeys
added the new GUC enable_distinct_reordering, which caused the test
failures.
I've rebased the patch on master. Tests pass here.
--
Best regards,
Alexander Pyhalov,
Postgres Professional
Attachments:
v3-0001-Use-custom-plan-machinery-for-SQL-functions.patch (+385 -86)
Alexander Pyhalov <a.pyhalov@postgrespro.ru> writes:
I've rebased patch on master. Tests pass here.
The cfbot still doesn't like it; my guess is that you built without
--with-libxml and so didn't notice the effects on xml.out.
I've looked through the patch briefly and have a few thoughts:
* You cannot use plancache.c like this:
plansource = CreateCachedPlan(NULL, fcache->src, CreateCommandTag((Node *)parsetree));
CreateCachedPlan supposes that raw_parse_tree == NULL means an empty
query. Now aside from the fact that you didn't update the comment
documenting that, this won't work, because RevalidateCachedQuery
also knows that raw_parse_tree == NULL means an empty query. If an
invalidation happens while the function is running, revalidation
will produce an empty plan list, causing failure. (It seems like
a regression test case exercising such invalidation is called for.)
I think the only way you can really make this work is to have two
kinds of cached plans: the current kind where the starting point
is a RawStmt, and a new kind where the starting point is an
analyzed-but-not-rewritten Query tree (or maybe a list of those,
but if you can hold it to one tree things will be simpler).
I'd be inclined to mechanize that by adding a field named something
like "Query *analyzed_parse_tree" to CachedPlanSource and inventing a
new creation function that parallels CreateCachedPlan but takes an
analyzed Query instead of a RawStmt. Then you'll need to fix
RevalidateCachedQuery so that if analyzed_parse_tree isn't NULL then
it ignores raw_parse_tree and proceeds straight to rewriting the
already-analyzed Query. You'll need to check all the other places
that touch CachedPlanSource.raw_parse_tree, but I think all the other
required updates are pretty obvious. (Do not omit updating the
comments in plancache.h and plancache.c that describe all this.)
* I'm not very convinced by the new logic in
StmtPlanRequiresRevalidation. Is it really the case that "check for
CMD_UTILITY" is sufficient? Even if the coding is okay, the comment
needs work to provide a better explanation why it's okay.
* I'd say lose the enable_sql_func_custom_plans GUC. We don't have
anything comparable for plpgsql and there's been (relatively) little
complaint about that. Aside from adding clutter to the patch, it
contorts the logic in functions.c because it forces you to support
two code paths.
* The regression output changes like
-CONTEXT: SQL function "regexp_match" statement 1
+CONTEXT: SQL function "regexp_match" during startup
seem fairly unfortunate. Aside from making maintenance of the patch
painful, the changed messages provide strictly less information to the
user. I would try hard to eliminate that behavioral change. I think
you could do so, or at least greatly reduce the number of diffs, by
having init_sql_fcache perform only raw parsing (producing a list of
RawStmts, or extracting a list of Queries from prosqlbody) and
delaying creation of a plan for a particular statement until you first
reach execution of that statement. This would be a better way all
around because it'd also fix the anomalies I mentioned upthread for
cases where a DDL statement earlier in the function should affect
parse analysis of a later statement.
* release_plans seems kind of wrong: isn't it giving up most of
the benefit of doing this? plpgsql doesn't seem to release plans
unless it's forced to revalidate. Why should SQL functions
behave differently? Also, I'm pretty sure it's wrong that
you have
+ ReleaseCachedPlan(cplan, NULL);
with no ResourceOwner tracking the reference. That probably
means that the code leaks cached plans on error. For the
current patch iteration with the function's data structure only
meant to live as long as the query, it should be sufficient to
use CurrentResourceOwner to manage these cplan references.
* The patch is hard to read in functions.c because there's a mix of
code-motion and actual logic changes that are touching the same
places. Perhaps it'd be worth splitting it into a series of two
patches: the first one just does code motion such as pushing existing
logic into a new subroutine, and then the second one has the logic
changes. Or maybe that wouldn't help much, but consider giving it a
try.
regards, tom lane
Tom Lane wrote on 2025-01-17 21:27:
> Alexander Pyhalov <a.pyhalov@postgrespro.ru> writes:
>> I've rebased patch on master. Tests pass here.
> The cfbot still doesn't like it; my guess is that you built without
> --with-libxml and so didn't notice the effects on xml.out.
Hi. Thank you for review.
I've updated patch.
> I've looked through the patch briefly and have a few thoughts:
> * You cannot use plancache.c like this:
>     plansource = CreateCachedPlan(NULL, fcache->src,
>                                   CreateCommandTag((Node *) parsetree));
> CreateCachedPlan supposes that raw_parse_tree == NULL means an empty
> query. Now aside from the fact that you didn't update the comment
> documenting that, this won't work, because RevalidateCachedQuery
> also knows that raw_parse_tree == NULL means an empty query. If an
> invalidation happens while the function is running, revalidation
> will produce an empty plan list, causing failure. (It seems like
> a regression test case exercising such invalidation is called for.)
Yes, yes... I've spent some time trying to reproduce it, or to prove
that plan revalidation is impossible (as I had assumed). I was wrong:
plan revalidation is possible, and I've added some test cases. First of
all, we don't save our plans to CacheMemoryContext, so the usual
invalidation callbacks don't touch our cached plans. However, I've
managed to reproduce it by changing the row_security GUC in a SQL
function. After the first round of creating plans and running them, on
the second call we can see that plansource->rewriteRowSecurity
mismatches row_security, and we invalidate the plan. I've added a Query
to the plansource, and rewrite it in RevalidateCachedQuery(), as you
suggested.
An interesting side effect: earlier, if row_security was altered in a
SQL function, this had no impact on further function execution (the
generic plan was already built). Now, if revalidation of the generic
plan is triggered, the query ends up with "ERROR: query would be
affected by row-level security policy for table".
> * I'm not very convinced by the new logic in
> StmtPlanRequiresRevalidation. Is it really the case that "check for
> CMD_UTILITY" is sufficient? Even if the coding is okay, the comment
> needs work to provide a better explanation why it's okay.
Updated logic to match stmt_requires_parse_analysis().
> * I'd say lose the enable_sql_func_custom_plans GUC. We don't have
> anything comparable for plpgsql and there's been (relatively) little
> complaint about that. Aside from adding clutter to the patch, it
> contorts the logic in functions.c because it forces you to support
> two code paths.
Removed it.
> * The regression output changes like
>     -CONTEXT: SQL function "regexp_match" statement 1
>     +CONTEXT: SQL function "regexp_match" during startup
> seem fairly unfortunate. Aside from making maintenance of the patch
> painful, the changed messages provide strictly less information to the
> user. I would try hard to eliminate that behavioral change. I think
> you could do so, or at least greatly reduce the number of diffs, by
> having init_sql_fcache perform only raw parsing (producing a list of
> RawStmts, or extracting a list of Queries from prosqlbody) and
> delaying creation of a plan for a particular statement until you first
> reach execution of that statement. This would be a better way all
> around because it'd also fix the anomalies I mentioned upthread for
> cases where a DDL statement earlier in the function should affect
> parse analysis of a later statement.
Yes, I've seen this remark. However, I was a bit afraid to touch all
the guts of functions.c processing. So instead I've just recorded the
statement number of the currently-planned query and use it in error
messages. If you insist, I can try rewriting the current
plan-everything-then-run approach, but without good tests for SQL
functions this looks error-prone.
> * release_plans seems kind of wrong: isn't it giving up most of
> the benefit of doing this? plpgsql doesn't seem to release plans
> unless it's forced to revalidate. Why should SQL functions
> behave differently?
The issue is that we potentially create a custom plan for each set of
SQL function arguments, and in queries like
    SELECT f(field) FROM T1
this can consume a lot of memory. The good news is that this stops
mattering once execution switches to generic plans, after some rows
have been processed and some custom plans have been built.
Of course, there are bad cases. For example, we've seen that simple
functions like
create function test(n int) returns text as $$
select string_agg(i::text, '') FROM generate_series(1,n) i
$$
language sql;
always use a custom plan for small values of n - the plan is the same,
but looks cheaper (as we can't predict the correct row count without
knowing the function arguments):
Aggregate (cost=0.15..0.16 rows=1 width=32)
  ->  Function Scan on generate_series i (cost=0.00..0.08 rows=8 width=4)
versus
Aggregate (cost=17.50..17.52 rows=1 width=32)
  ->  Function Scan on generate_series i (cost=0.00..10.00 rows=1000 width=4)
So if this appears somewhere deep in data-generation scenarios,
planning time can increase significantly (earlier, the generic plan was
always used).
Coming back to release_plans(): it's needed because for each set of
arguments we may create a new custom plan. It doesn't affect the
generic plan, as the plansource itself holds a reference to it.
> Also, I'm pretty sure it's wrong that you have
>     + ReleaseCachedPlan(cplan, NULL);
> with no ResourceOwner tracking the reference. That probably
> means that the code leaks cached plans on error. For the
> current patch iteration with the function's data structure only
> meant to live as long as the query, it should be sufficient to
> use CurrentResourceOwner to manage these cplan references.
Cached plans are not stored in a long-lived context for now, so we
can't use CurrentResourceOwner;
ReleaseCachedPlan(cplan, CurrentResourceOwner) would die on
"Assert(plan->is_saved)".
> * The patch is hard to read in functions.c because there's a mix of
> code-motion and actual logic changes that are touching the same
> places. Perhaps it'd be worth splitting it into a series of two
> patches: the first one just does code motion such as pushing existing
> logic into a new subroutine, and then the second one has the logic
> changes. Or maybe that wouldn't help much, but consider giving it a
> try.
I've moved the split-out of check_planned_stmt() into a separate patch.
--
Best regards,
Alexander Pyhalov,
Postgres Professional
Attachments:
0002-Use-custom-plan-machinery-for-SQL-function.patch (+398 -50)
0001-Split-out-SQL-functions-checks-from-init_execution_s.patch (+36 -27)
Alexander Pyhalov wrote on 2025-01-29 17:35:
>> The cfbot still doesn't like it; my guess is that you built without
>> --with-libxml and so didn't notice the effects on xml.out.
> Hi. Thank you for review.
> I've updated patch.
Sorry, I missed one local patch that fixes memory bloat during
replanning. Also fixed a compiler warning.
--
Best regards,
Alexander Pyhalov,
Postgres Professional
Attachments:
v5-0002-Use-custom-plan-machinery-for-SQL-function.patch (+402 -50)
v5-0001-Split-out-SQL-functions-checks-from-init_execution_s.patch (+36 -27)
Hi
On Thu, Jan 30, 2025 at 9:50 Alexander Pyhalov <a.pyhalov@postgrespro.ru>
wrote:
> Sorry, I missed one local patch that fixes memory bloat during
> replanning. Also fixed a compiler warning.
Did you do some performance checks?
I tried a worst case:
CREATE OR REPLACE FUNCTION fx(int)
RETURNS int AS $$
SELECT $1 + $1
$$ LANGUAGE SQL IMMUTABLE;
CREATE OR REPLACE FUNCTION fx2(int)
RETURNS int AS $$
SELECT $1 * 2
$$ LANGUAGE SQL IMMUTABLE;
do $$
begin
for i in 1..1000000 loop
perform fx(i); -- or fx2
end loop;
end;
$$;
DO
The patched version reduces the difference between executing fx and fx2,
but the patched version is about 10% slower than unpatched.
The overhead of the plan cache looks significant for simple cases (and
a lot of SQL functions are very simple).
Regards
Pavel
Pavel Stehule <pavel.stehule@gmail.com> writes:
Did you do some performance checks?
This is a good question to ask ...
I tried some worst case
CREATE OR REPLACE FUNCTION fx(int)
RETURNS int AS $$
SELECT $1 + $1
$$ LANGUAGE SQL IMMUTABLE;
... but I don't think tests like this will give helpful answers.
That function is simple enough to be inlined:
regression=# explain verbose select fx(f1) from int4_tbl;
QUERY PLAN
---------------------------------------------------------------
Seq Scan on public.int4_tbl (cost=0.00..1.06 rows=5 width=4)
Output: (f1 + f1)
(2 rows)
So functions.c shouldn't have any involvement at all in the
actually-executed PERFORM expression, and whatever difference
you measured must have been noise. (If the effect *is* real,
we'd better find out why.)
You need to test with a non-inline-able function. Looking
at the inlining conditions in inline_function(), one simple
hack is to make the function return SETOF. That'll only
exercise the returns-set path in functions.c though, so it'd
be advisable to check other inline-blocking conditions too.
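Following that hint, a sketch of a non-inlinable variant (returning SETOF is one of the conditions that makes inline_function() refuse to inline, so calls go through functions.c; the function name is hypothetical):

```sql
CREATE OR REPLACE FUNCTION fx_set(int) RETURNS SETOF int AS $$
  SELECT $1 + $1
$$ LANGUAGE SQL IMMUTABLE;

-- EXPLAIN VERBOSE SELECT * FROM fx_set(41);
-- keeps the function call instead of inlining it to (41 + 41)
```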
regards, tom lane
On Mon, Feb 3, 2025 at 17:00 Tom Lane <tgl@sss.pgh.pa.us> wrote:
> So functions.c shouldn't have any involvement at all in the
> actually-executed PERFORM expression, and whatever difference
> you measured must have been noise. (If the effect *is* real,
> we'd better find out why.)
I am sorry. I was wrong - I tested inlining on a different case
(2025-02-03 17:24:25) postgres=# explain analyze verbose select fx(i) from
generate_series(1,10) g(i);
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Function Scan on pg_catalog.generate_series g  (cost=0.00..0.13 rows=10 width=4) (actual time=0.016..0.018 rows=10 loops=1)
   Output: (i + i)
   Function Call: generate_series(1, 10)
 Planning:
   Buffers: shared hit=11
 Planning Time: 0.190 ms
 Execution Time: 0.066 ms
(7 rows)
(2025-02-03 17:25:06) postgres=# explain analyze verbose select
fx((random()*100)::int) from generate_series(1,10) g(i);
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Function Scan on pg_catalog.generate_series g  (cost=0.00..2.68 rows=10 width=4) (actual time=0.104..0.169 rows=10 loops=1)
   Output: fx(((random() * '100'::double precision))::integer)
   Function Call: generate_series(1, 10)
 Planning Time: 0.054 ms
 Execution Time: 0.182 ms
(5 rows)
I read https://wiki.postgresql.org/wiki/Inlining_of_SQL_functions, and I
didn't remember the rule `if an actual argument to the function call is a
volatile expression, then it must not be referenced in the body more than
once` well, so I didn't apply this rule correctly. I'll recheck this test.
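That rule explains the asymmetry in the benchmark (a sketch based on the wiki rule; the EXPLAIN outcomes are my reading of it, not results from the thread): fx references its argument twice, so a volatile actual argument blocks inlining, while fx2 references it once and stays inlinable:

```sql
-- fx uses $1 twice: inlining with a volatile argument such as random()
-- would evaluate random() twice and change semantics, so the planner
-- keeps the function call and fx runs through functions.c every time.
CREATE OR REPLACE FUNCTION fx(int) RETURNS int AS $$
  SELECT $1 + $1
$$ LANGUAGE SQL IMMUTABLE;

-- fx2 uses $1 only once, so it can be inlined even with a volatile argument.
CREATE OR REPLACE FUNCTION fx2(int) RETURNS int AS $$
  SELECT 2 * $1
$$ LANGUAGE SQL IMMUTABLE;

EXPLAIN VERBOSE SELECT fx((random()*100)::int);   -- should show the fx(...) call
EXPLAIN VERBOSE SELECT fx2((random()*100)::int);  -- should show the inlined expression
```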
Regards
Pavel
Show quoted text
regards, tom lane
Hi
I did multiple benchmark runs, and it still looks like the proposed patch
doesn't help and has significant overhead
testcase:
create or replace function fx(int) returns int as $$ select $1 + $1; $$
language sql immutable;
create or replace function fx2(int) returns int as $$ select 2 * $1; $$
language sql immutable;
I tested
do $$
begin
for i in 1..1000000 loop
perform fx((random()*100)::int); -- or fx2
end loop;
end;
$$;
Results (master, patched):
fx: 17067 ms, 22165 ms
fx2: 2234 ms, 2311 ms
The execution of dynamic SQL:
(2025-02-03 18:47:33) postgres=# do $$
begin
for i in 1..1000000 loop
execute 'select $1 + $1' using (random()*100)::int;
end loop;
end;
$$;
DO
Time: 13412.990 ms (00:13.413)
In the profiler I see a significant overhead from the parser, so it looks
like there is some additional cost there, but the plan cache is not used.
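For comparison (a sketch, not from the thread; add_twice is a made-up statement name): plpgsql's EXECUTE re-parses the query string on every iteration, whereas a server-side prepared statement caches the parse and analysis once per session:

```sql
-- PREPARE parses and analyzes the statement once; each EXECUTE reuses
-- the cached statement, so the parser overhead seen in the profile
-- would not appear here.
PREPARE add_twice(int) AS SELECT $1 + $1;
EXECUTE add_twice((random()*100)::int);
DEALLOCATE add_twice;
```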
Please, can somebody recheck my tests?
Regards
Pavel
Pavel Stehule <pavel.stehule@gmail.com> writes:
> I did multiple benchmark runs, and it still looks like the proposed patch
> doesn't help and has significant overhead
Yeah, your fx() test case is clearly worse. For me,
HEAD:
regression=# do $$
begin
for i in 1..1000000 loop
perform fx((random()*100)::int); -- or fx2
end loop;
end;
$$;
DO
Time: 5229.184 ms (00:05.229)
PATCH:
regression=# do $$
begin
for i in 1..1000000 loop
perform fx((random()*100)::int); -- or fx2
end loop;
end;
$$;
DO
Time: 6934.413 ms (00:06.934)
Adding some debug printout shows me that BuildCachedPlan is called to
construct a custom plan on every single execution, which is presumably
because the patch doesn't make any attempt to carry plancache state
across successive executions of the same query. If we were saving that
state it would have soon switched to a generic plan and then won big.
So, even though I thought we could leave that for later, it seems like
maybe we have to have it before we'll have a committable patch.
There might be some residual inefficiency in there though. In the
unpatched code we'd be calling pg_parse_query and pg_plan_query once
per execution. You'd think that would cost more than BuildCachedPlan,
which can skip the raw-parsing part. Even more interesting, the
patch gets slower yet if we use a new-style SQL function:
regression=# create or replace function fx3 (int) returns int immutable
regression-# begin atomic select $1 + $1; end;
CREATE FUNCTION
Time: 0.813 ms
regression=# do $$
begin
for i in 1..1000000 loop
perform fx3((random()*100)::int); -- or fx2
end loop;
end;
$$;
DO
Time: 8007.062 ms (00:08.007)
That makes no sense either, because with a new-style SQL
function we should be skipping parse analysis as well.
But wait: HEAD takes
Time: 6632.709 ms (00:06.633)
to do the same thing. So somehow the new-style SQL function
stuff is very materially slower in this use-case, with or
without this patch. I do not understand why.
Definitely some performance investigation needs to be done here.
Even without cross-query plan caching, I don't see why the
patch isn't better than it is. It ought to be at least
competitive with the unpatched code.
(I've not read the v5 patch yet, so I have no theories.)
regards, tom lane
I wrote:
> But wait: HEAD takes
> Time: 6632.709 ms (00:06.633)
> to do the same thing. So somehow the new-style SQL function
> stuff is very materially slower in this use-case, with or
> without this patch. I do not understand why.
"perf" tells me that in the fx3 test, a full third of the runtime
is spent inside stringToNode(), with about three-quarters of that
going into pg_strtok(). This makes no sense to me: we'll be reading
the prosqlbody of fx3(), sure, but that's not enormously long (about
1200 bytes). And pg_strtok() doesn't look remarkably slow. There's
no way this should be taking more time than raw parsing + parse
analysis, even for such a trivial query as "select $1 + $1".
There's been some talk of getting rid of our existing nodetree
storage format in favor of something more efficient. Maybe we
should put a higher priority on getting that done. But anyway,
that seems orthogonal to the current patch.
> Even without cross-query plan caching, I don't see why the
> patch isn't better than it is. It ought to be at least
> competitive with the unpatched code.
This remains true.
regards, tom lane
Hi, folks.
I've looked through performance and found that most of the performance issues
were caused by the CachedPlanSource machinery itself. At least, a lot of them
were. And so I decided to go ahead and implement a plan cache for SQL
functions. I'm not sure that it's clean enough, but at least it seems to be
working. While working on it I found issues in
RevalidateCachedQuery() and fixed them. What has changed:
- plans are now released after execution;
- revalidation now takes locks on analyzed_tree;
- the query tree is copied prior to analysis in RevalidateCachedQuery();
- queryTree_list in SQLFunctionCache is no longer necessary and so has been
removed.
Now SQL function plans are actually saved. Most of this is a
simplified version of the plpgsql plan cache. Perhaps I've missed
something.
There are some cases where we don't save the CachedPlanSource for future use.
One case is trigger functions (pl_comp.c has some logic to handle
them specially, and I didn't want to introduce it so far). Another (more
interesting) issue is invalidation. SQL functions have a feature of
rewriting the query when the target list doesn't match the function call
context. I hadn't thought this through carefully in the last patch version,
but luckily some tests showed this behavior. When compiled with
RANDOMIZE_ALLOCATED_MEMORY, the following test case dumped core (because
after invalidation the executor got "name" fields but expected text):
create table t (nspname text, tname text);
CREATE OR REPLACE FUNCTION get_basic_attributes_from_pg_tables(
_schemaname name, _tablename name)
RETURNS TABLE(tname text, tablespace text, owner text)
LANGUAGE sql
AS $function$
SELECT
schemaname || '.' || tablename AS "full name",
tablespace AS "tablespace",
tableowner AS "tableowner"
FROM pg_tables
WHERE pg_tables.schemaname = _schemaname AND pg_tables.tablename =
_tablename
ORDER BY 1;
$function$;
create or replace function trg_func() RETURNS TRIGGER
AS
$$
declare
t record;
begin
FOR t IN (SELECT * FROM
get_basic_attributes_from_pg_tables(new.nspname, new.tname)) LOOP
RAISE WARNING '"% % %"', t.owner, t.tablespace, t.tname;
END LOOP;
RETURN NEW;
END
$$ LANGUAGE PLPGSQL;
create trigger t_ins_t AFTER INSERT ON t FOR EACH ROW EXECUTE FUNCTION
trg_func();
set debug_discard_caches to 1;
insert into t values('pg_catalog', 'pg_class');
It's harder to achieve this without permanent cache (harder to trigger
revalidation), but it's still possible.
What happened here is that during revalidation the query plan was rebuilt,
but the modifications to the query tree made by check_sql_fn_retval() were
lost.
To fix this issue:
1) We avoid caching modified plans (and check_sql_fn_retval() now
reports whether it touched the query tree);
2) For non-cached plans we still need a hack (a callback) in
CachedPlanSource to rebuild the query tree if invalidation happens. This
callback rebuilds the query tree using check_sql_fn_retval(). We are sure
that the callback parameters specifying the actual function return type
do not need to be recalculated, as such plans can appear only during one
function execution and are not reused;
3) To prove that the result type was not changed between plan executions, we
build plans with fixed_result = true;
4) When we get a saved plan, prior to using it, we check that its result
tlist matches the one built while planning the function execution.
Otherwise, we recreate the CachedPlanSource.
Well, it turned out more complicated than I expected, but now it seems
simple SQL functions have much better performance.
create or replace function fx(int) returns int as $$ select $1 + $1; $$
language sql immutable;
create or replace function fx2(int) returns int as $$ select 2 * $1; $$
language sql immutable;
create or replace function fx3 (int) returns int immutable begin atomic
select $1 + $1; end;
create or replace function fx4(int) returns numeric as $$ select $1 +
$1; $$ language sql immutable;
-- sql function
do $$
begin
for i in 1..1000000 loop
perform fx((random()*100)::int);
end loop;
end;
$$;
Time: 3008.869 ms (00:03.009)
-- dynamic SQL
do
$$ begin
for i in 1..1000000 loop
execute 'select $1 + $1' using (random()*100)::int;
end loop;
end;
$$;
Time: 4915.295 ms (00:04.915)
-- pre-parsed function
do $$
begin
for i in 1..1000000 loop
perform fx3((random()*100)::int);
end loop;
end;
$$;
Time: 2992.166 ms (00:02.992)
-- no plan caching due to need in fixing target list:
do $$
begin
for i in 1..1000000 loop
perform fx4((random()*100)::int);
end loop;
end;
$$;
Time: 11020.820 ms (00:11.021)
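Presumably fx4 hits the slow path because its body computes an integer while the function is declared to return numeric, so check_sql_fn_retval() has to modify the query tree. A sketch of a workaround under that assumption (fx4b is a made-up name; this is my inference, not something tested in the thread):

```sql
-- Declared numeric but the body yields integer: check_sql_fn_retval()
-- must inject a coercion into the target list, which (per the patch)
-- disables plan caching for this function.
create or replace function fx4(int) returns numeric as $$
  select $1 + $1;
$$ language sql immutable;

-- Casting explicitly in the body makes the result tlist already match
-- the declared return type, so no retval rewrite should be needed and
-- the plan should be cacheable.
create or replace function fx4b(int) returns numeric as $$
  select ($1 + $1)::numeric;
$$ language sql immutable;
```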
--
Best regards,
Alexander Pyhalov,
Postgres Professional
Attachments:
v6-0001-Split-out-SQL-functions-checks-from-init_execution_s.patch
v6-0002-Use-custom-plan-machinery-for-SQL-function.patch
v6-0003-Introduce-SQL-functions-plan-cache.patch
v6-0004-Handle-SQL-functions-which-are-modified-between-rewr.patch
Hi
I can confirm a 60% speedup for execution of fx and fx3 - both
functions are very primitive, so for real code the benefit can be higher.
Unfortunately, there is about a 5% slowdown for inlined code, and for
plain plpgsql code too.
I tested fx4
create or replace function fx4(int) returns int immutable as $$ begin
return $1 + $1; end $$ language plpgsql;
and fx2
create or replace function fx2(int) returns int as $$ select 2 * $1; $$
language sql immutable;
and execution of the patched code is about 5% slower. It is strange that
this patch has a negative impact on plpgsql execution.
Regards
Pavel
Pavel Stehule wrote on 2025-02-26 22:34:
> Hi
> I can confirm a 60% speedup for execution of fx and fx3 - both
> functions are very primitive, so for real code the benefit can be
> higher.
> Unfortunately, there is about a 5% slowdown for inlined code, and for
> plain plpgsql code too.
> I tested fx4
> create or replace function fx4(int) returns int immutable as $$ begin
> return $1 + $1; end $$ language plpgsql;
> and fx2
> create or replace function fx2(int) returns int as $$ select 2 * $1; $$
> language sql immutable;
> and execution of the patched code is about 5% slower. It is strange that
> this patch has a negative impact on plpgsql execution.
> Regards
> Pavel
Hi. I've tried to reproduce the slowdown and couldn't.
create or replace function fx4(int) returns int immutable as $$ begin
return $1 + $1; end $$ language plpgsql;
do $$
begin
for i in 1..5000000 loop
perform fx4((random()*100)::int); -- or fx2
end loop;
end;
$$;
OLD results:
Time: 8268.614 ms (00:08.269)
Time: 8178.836 ms (00:08.179)
Time: 8306.694 ms (00:08.307)
New (patched) results:
Time: 7743.945 ms (00:07.744)
Time: 7803.109 ms (00:07.803)
Time: 7736.735 ms (00:07.737)
Not sure why the new code is faster (perhaps some noise?) - looking at perf
flamegraphs I don't see anything evident.
create or replace function fx2(int) returns int as $$ select 2 * $1; $$
language sql immutable;
do $$
begin
for i in 1..5000000 loop
perform fx2((random()*100)::int); -- or fx2
end loop;
end;
$$;
OLD results:
Time: 5346.471 ms (00:05.346)
Time: 5359.222 ms (00:05.359)
Time: 5316.747 ms (00:05.317)
New (patched) results:
Time: 5188.363 ms (00:05.188)
Time: 5225.322 ms (00:05.225)
Time: 5203.667 ms (00:05.204)
--
Best regards,
Alexander Pyhalov,
Postgres Professional
On Thu, Feb 27, 2025 at 13:25 Alexander Pyhalov <a.pyhalov@postgrespro.ru> wrote:
> Pavel Stehule wrote on 2025-02-26 22:34:
>> Hi
>> I can confirm a 60% speedup for execution of fx and fx3 - both
>> functions are very primitive, so for real code the benefit can be
>> higher.
>> Unfortunately, there is about a 5% slowdown for inlined code, and for
>> plain plpgsql code too.
>> I tested fx4
>> create or replace function fx4(int) returns int immutable as $$ begin
>> return $1 + $1; end $$ language plpgsql;
>> and fx2
>> create or replace function fx2(int) returns int as $$ select 2 * $1; $$
>> language sql immutable;
>> and execution of the patched code is about 5% slower. It is strange that
>> this patch has a negative impact on plpgsql execution.
>> Regards
>> Pavel
> Hi. I've tried to reproduce the slowdown and couldn't.
> create or replace function fx4(int) returns int immutable as $$ begin
> return $1 + $1; end $$ language plpgsql;
> do $$
> begin
> for i in 1..5000000 loop
> perform fx4((random()*100)::int); -- or fx2
> end loop;
> end;
> $$;
> OLD results:
> Time: 8268.614 ms (00:08.269)
> Time: 8178.836 ms (00:08.179)
> Time: 8306.694 ms (00:08.307)
> New (patched) results:
> Time: 7743.945 ms (00:07.744)
> Time: 7803.109 ms (00:07.803)
> Time: 7736.735 ms (00:07.737)
> Not sure why the new code is faster (perhaps some noise?) - looking at perf
> flamegraphs I don't see anything evident.
> create or replace function fx2(int) returns int as $$ select 2 * $1; $$
> language sql immutable;
> do $$
> begin
> for i in 1..5000000 loop
> perform fx2((random()*100)::int); -- or fx2
> end loop;
> end;
> $$;
> OLD results:
> Time: 5346.471 ms (00:05.346)
> Time: 5359.222 ms (00:05.359)
> Time: 5316.747 ms (00:05.317)
> New (patched) results:
> Time: 5188.363 ms (00:05.188)
> Time: 5225.322 ms (00:05.225)
> Time: 5203.667 ms (00:05.204)
I'll try to get profiles.
Regards
Pavel
--
Best regards,
Alexander Pyhalov,
Postgres Professional