right way of using case-expressions in plpgsql functions

Started by Victor Dobrovolskyover 2 years ago8 messagesgeneral
Jump to latest
#1Victor Dobrovolsky
booby.stager@gmail.com

I understood from documentation that case expression can be
1) optimized by planner
2) may compute its subexpressions in advance, in case of presence
aggregation functions in them, for example.

The question is - how it is combined with generic prepared plans in
pl/pgsql.
How can I deduct - when using case-expression is "safe" regarding query
goals with parameters in pl/pgsql and when is not.
There are two cases - expression in select list and expression in where
clause.

in where clause:
suppose I have a sql-function like this:

CREATE OR REPLACE FUNCTION nvl_in_where(pf anycompatible, px anycompatible,
py anycompatible)
RETURNS boolean
LANGUAGE sql
STABLE
AS $function$
select (case when px is not null then pf = px
else pf is not distinct from py
end);
$function$
;

and then I use it in some pl/pgsql function:

CREATE OR REPLACE FUNCTION plsql_query_function(in pn numeric )
RETURNS boolean
LANGUAGE plpgsql
STABLE
AS $function$
Declare
sr record;
Begin

For sr in Select tbl.p1,tbl.p2
From tbl
Where
nvl_in_where(tbl.p1, pn, tbl.p1)
Loop
-- do some logic with sr ...
-- ...
Null;
end loop;
end;
$function$
;

If execute this query individually with fixed value of parameter $1, the
query plan
would be like

Select tbl.p1,tbl.p2
From tbl
Where
tbl.p1 = pn::numeric
;

or

Select tbl.p1,tbl.p2
From tbl
Where
tbl.p1 is not distinct from tbl.p1
;

depending if pn is null or not.

The documentation states that after some executions of such functions the
plan should become generic.
What is a generic plan for such a case and how would it work?
If it is safe to use function like nvl_in_where in where clause of queries
in plpgsql function or,
maybe, I should only use that query only with the execute statement in
plpgsql?

in select list:

suppose, i need something like this :

select case $1
when '1'::numeric then tbl.p1
when '2'::numeric then tbl.p2
...
end as cresult
from tbl ...

Can I be sure, that this expression would not be "optimised" in generic
plan just to
select
tbl.some_lucky_fied
from tbl
Can I use this type of expression in the select list regarding generic plans
or should I prefer dynamic execution for such type queries in plpgsql?

Thanks in advance.

PS
The question looks like from a novice, and , indeed, I am.

#2David G. Johnston
david.g.johnston@gmail.com
In reply to: Victor Dobrovolsky (#1)
Re: right way of using case-expressions in plpgsql functions

On Sunday, October 15, 2023, Victor Dobrovolsky <booby.stager@gmail.com>
wrote:

select (case when px is not null then pf = px
else pf is not distinct from py
end);

Every single time this function is called “px is not null” will be
evaluated and then one of the two branches will be evaluated. Nothing the
optimizer does will change that. The planner for the function internals
does not know whether px will or will not be null on any given invocation.

David J.

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: David G. Johnston (#2)
Re: right way of using case-expressions in plpgsql functions

"David G. Johnston" <david.g.johnston@gmail.com> writes:

On Sunday, October 15, 2023, Victor Dobrovolsky <booby.stager@gmail.com>
wrote:

select (case when px is not null then pf = px
else pf is not distinct from py
end);

Every single time this function is called “px is not null” will be
evaluated and then one of the two branches will be evaluated. Nothing the
optimizer does will change that. The planner for the function internals
does not know whether px will or will not be null on any given invocation.

Not necessarily --- I think the SQL-language function will get inlined
and then there would be opportunity for const-simplification if a
known value is available for px in the outer function.

At least in the px-not-null case, having "pf = px" rather than an
impenetrable CASE condition will probably be enough better for
optimization that the plancache would never choose to switch to a
generic plan. However, that might not be true for the other case,
since we aren't terribly smart about optimizing NOT DISTINCT conditions.
So the performance you get might well vary depending on which case
occurs more often during the first few query runs.

On the whole though, the entire question seems like solving the wrong
problem. If you have to resort to this kind of thing to get your
answers, it's past time to rethink your data representation.

regards, tom lane

#4Victor Dobrovolsky
booby.stager@gmail.com
In reply to: Tom Lane (#3)
Re: right way of using case-expressions in plpgsql functions

First of all, thanks everyone for the answers.

вс, 15 окт. 2023 г. в 20:08, Tom Lane <tgl@sss.pgh.pa.us>:

"David G. Johnston" <david.g.johnston@gmail.com> writes:

On Sunday, October 15, 2023, Victor Dobrovolsky <booby.stager@gmail.com>
wrote:

select (case when px is not null then pf = px
else pf is not distinct from py
end);

Every single time this function is called “px is not null” will be
evaluated and then one of the two branches will be evaluated. Nothing

the

optimizer does will change that. The planner for the function internals
does not know whether px will or will not be null on any given

invocation.

Not necessarily --- I think the SQL-language function will get inlined
and then there would be opportunity for const-simplification if a
known value is available for px in the outer function.

At least in the px-not-null case, having "pf = px" rather than an
impenetrable CASE condition will probably be enough better for
optimization that the plancache would never choose to switch to a
generic plan. However, that might not be true for the other case,
since we aren't terribly smart about optimizing NOT DISTINCT conditions.
So the performance you get might well vary depending on which case
occurs more often during the first few query runs.

Ok. I am a "man coming from Oracle-sql" .
The first case is reminiscence of using oracle nvl "sql-function" in form of
Select *
from tbl
Where tbl.somefield = nvl(:parameter, tbl.somefield)
;
In such a case Oracle will produce generic plan like this:
Filter :parameter is not null
Select * from tbl
Where tbl.somefield = :parameter -- index scan if possible
Union All
Filter :parameter is null
Select * from tbl
Where 1=1 -- table full scan guaranteed
;
Here aligned left Filter conditions assured, that only one branch of union
all will be really taken in each particular execution of query.

Regarding the "pf is not distinct from py" condition - I saw that it was
translated to "not (pf is distinct from py)"
which is totally correct from a mathematical point of view.
But maybe it would be useful to have an independent translation of the
statement that "pf is identical to pf",
to eliminate the condition totally at last....
(don't pay too much attention, this is a superficial newbie opinion, anyway)

In total - should I manually divide these cases in the plpgsql function if
I like to avoid any prepared statements caveats, or should I use
"execute"-statements, if I am lazy enough for that,
or, there is nothing to complain in terms of "generic plan"...

On the whole though, the entire question seems like solving the wrong

problem. If you have to resort to this kind of thing to get your
answers, it's past time to rethink your data representation.

Definitely Yes.
But...
My goal is "to translate" some application "as fast as possible", using "as
few structure transformations as possible".
From that - "short and dirty translation" - point of view - should I prefer
to divide that
$$
Select
case $1
when '1'::numeric then tbl.p1
when '2'::numeric then tbl.p2
...
end as cresult
$$

expression into:
$$
...
if $1 = '1'::numeric then (query_1)
elsif when '2'::numeric then (query_2)
...
end if;
...
$$
If I do not want to use an execute statement for that?

Thank you.

#5Ron
ronljohnsonjr@gmail.com
In reply to: Victor Dobrovolsky (#1)
Re: right way of using case-expressions in plpgsql functions

On 10/15/23 11:19, Victor Dobrovolsky wrote:
[snip]

The documentation states that after some executions of such functions the
plan should become generic.
What is a generic plan for such a case and how would it work?

It's highly dependent on the query

When I see this happen (after we notice that a procedure starts taking a
/long/ time), the query planner flips from a custom plan to a generic plan
after about the fifth execution in a session of a function/procedure.

This will make it calculate the plan every time:
set plan_cache_mode = force_custom_plan;

--
Born in Arizona, moved to Babylonia.

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Victor Dobrovolsky (#4)
Re: right way of using case-expressions in plpgsql functions

Victor Dobrovolsky <booby.stager@gmail.com> writes:

From that - "short and dirty translation" - point of view - should I prefer
to divide that
$$
Select
case $1
when '1'::numeric then tbl.p1
when '2'::numeric then tbl.p2
...
end as cresult
$$

expression into:
$$
...
if $1 = '1'::numeric then (query_1)
elsif when '2'::numeric then (query_2)
...
end if;
...
$$

Yeah, that would probably be a preferable approach if you're hoping
for significantly different query plans for the two cases. PG does
not have the sort of run-time plan choice mechanism that you're
describing for Oracle.

regards, tom lane

#7Victor Dobrovolsky
booby.stager@gmail.com
In reply to: Tom Lane (#6)
Re: right way of using case-expressions in plpgsql functions

Yeah, that would probably be a preferable approach if you're hoping
for significantly different query plans for the two cases.

Thank you.
My goal is to find out some basic rules that could help me to navigate
issues like this.
Regarding generalized plans in general, and the use of the case-expressions
in particular.

пн, 16 окт. 2023 г. в 00:15, Tom Lane <tgl@sss.pgh.pa.us>:

Show quoted text

Victor Dobrovolsky <booby.stager@gmail.com> writes:

From that - "short and dirty translation" - point of view - should I

prefer

to divide that
$$
Select
case $1
when '1'::numeric then tbl.p1
when '2'::numeric then tbl.p2
...
end as cresult
$$

expression into:
$$
...
if $1 = '1'::numeric then (query_1)
elsif when '2'::numeric then (query_2)
...
end if;
...
$$

Yeah, that would probably be a preferable approach if you're hoping
for significantly different query plans for the two cases. PG does
not have the sort of run-time plan choice mechanism that you're
describing for Oracle.

regards, tom lane

#8Victor Dobrovolsky
booby.stager@gmail.com
In reply to: Ron (#5)
Re: right way of using case-expressions in plpgsql functions

Thank you. I'll take it.

пн, 16 окт. 2023 г. в 00:20, Ron <ronljohnsonjr@gmail.com>:

Show quoted text

On 10/15/23 11:19, Victor Dobrovolsky wrote:

[snip]

The documentation states that after some executions of such functions the
plan should become generic.
What is a generic plan for such a case and how would it work?

It's highly dependent on the query

When I see this happen (after we notice that a procedure starts taking a
*long* time), the query planner flips from a custom plan to a generic
plan after about the fifth execution in a session of a function/procedure.

This will make it calculate the plan every time:
set plan_cache_mode = force_custom_plan;

--
Born in Arizona, moved to Babylonia.