Options to rowwise persist result of stable/immutable function with RECORD result
Hi,
I have noticed a rather odd behaviour that is not strictly a bug, but is
unexpected.
It is when a immutable (or stable) PG function is returning results in a
record structure a select on these calls the function repeatedly for each
element in the output record.
See below for an example.
Sure I can work around this by returning in an array, or materialised as a
whole by e.g. a materialised CTE, but what I'm looking for is *materialising
of just the individual row *during processing, if the function is to be
called on many rows.
Obviously in theory the returned record could be very complex, so we might
not want it materialised in general, but an option to do so would be nice.
I would suggest that a WITH could be marked with a new "MATERIALIZED *ROW*"
option (reusing already reserved keywords).
Note how I below have set the cost extreme, in this test, the value does
not affect the behaviour..
The result set here have five elements, if i change the type to VOLATILE,
the execution time is reduced by a factor of five (see the difference
between the stamp of line one and two). It is directly proportional to the
number of elements requested from the record (here I requested all)
(The real life scenario is a function that by a list of reg_ex expessions,
splits up the input in numerous fields, And I noticed the behaviour as a
raise function added for debug, put out the same repeatedly.)
-----------------
DROP TYPE IF EXISTS septima.foo_type CASCADE;
CREATE TYPE septima.foo_type AS (a text, b text, c text, d text, e text);
DROP FUNCTION IF EXISTS septima.foo(text);
CREATE OR REPLACE FUNCTION septima.foo(inp text) RETURNS septima.foo_type
AS
$BODY$
DECLARE
result_record septima.foo_type;
i BIGINT :=12345678;
BEGIN
WHILE 0<i LOOP
i=i-1;
END LOOP;
RETURN result_record;
END
$BODY$
LANGUAGE plpgsql IMMUTABLE
COST 1234567890;
;
WITH x AS (
SELECT * FROM (
SELECT clock_timestamp() rowstart, (g).*, clock_timestamp() rowend FROM
(
SELECT septima.foo(inp) g FROM (
SELECT '1' inp UNION
SELECT '2')
y) x
) x
)
SELECT * FROM x;
DROP TYPE IF EXISTS septima.foo_type CASCADE;
Med venlig hilsen
*Eske Rahn*
Seniorkonsulent
+45 93 87 96 30
eske@septima.dk
--------------------------
Septima P/S
Frederiksberggade 19, 2. sal
1459 København K
+45 72 30 06 72
https://septima.dk
On Tuesday, March 21, 2023, Eske Rahn <eske@septima.dk> wrote:
Hi,
I have noticed a rather odd behaviour that is not strictly a bug, but is
unexpected.It is when a immutable (or stable) PG function is returning results in a
record structure a select on these calls the function repeatedly for each
element in the output record.
The LATERAL join modifier exists to handle this kind of situation.
David J.
Hi,
Thanks for the quick answer *:-D*
That was a nice sideeffect of lateral.
In the example, the calling code also gets simplified:
WITH x AS (
SELECT clock_timestamp() rowstart, *, clock_timestamp() rowend FROM (
SELECT '1' inp UNION
SELECT '2'
) y, LATERAL septima.foo(inp) g
)
SELECT * FROM x;
That solved the issue at hand, in a much better way. Thanks
Though I still fail to see *why* the other way should generally call the
function for every column in the *result* record - if the function is
STABLE or IMMUTABLE.
BUT as I can not think up a sensible example where LATERAL will *not* do
the trick, so the oddity becomes academic.
So just a thing to remember: *always use lateral with functions with record
result types* - unless they are volatile)
Med venlig hilsen
*Eske Rahn*
Seniorkonsulent
+45 93 87 96 30
eske@septima.dk
--------------------------
Septima P/S
Frederiksberggade 19, 2. sal
1459 København K
+45 72 30 06 72
https://septima.dk
On Wed, Mar 22, 2023 at 10:50 PM David G. Johnston <
david.g.johnston@gmail.com> wrote:
Show quoted text
On Tuesday, March 21, 2023, Eske Rahn <eske@septima.dk> wrote:
Hi,
I have noticed a rather odd behaviour that is not strictly a bug, but is
unexpected.It is when a immutable (or stable) PG function is returning results in a
record structure a select on these calls the function repeatedly for each
element in the output record.The LATERAL join modifier exists to handle this kind of situation.
David J.
On Wed, Mar 22, 2023 at 4:32 PM Eske Rahn <eske@septima.dk> wrote:
Hi,
Thanks for the quick answer *:-D*
That was a nice sideeffect of lateral.
In the example, the calling code also gets simplified:
WITH x AS (
SELECT clock_timestamp() rowstart, *, clock_timestamp() rowend FROM (
SELECT '1' inp UNION
SELECT '2'
) y, LATERAL septima.foo(inp) g
)
SELECT * FROM x;That solved the issue at hand, in a much better way. Thanks
Though I still fail to see *why* the other way should generally call the
function for every column in the *result* record - if the function is
STABLE or IMMUTABLE.
It gets rewritten to be effectively:
select func_call(...).col1, func_call(...).col2, func_call(...).col3
under the assumption that repeating the function call will be cheap and
side-effect free. It was never ideal but fixing that form of optimization
was harder than implementing LATERAL where the multi-column result has a
natural output in the form of a multi-column table. A normal function call
in the target list really means "return a single value" which is at odds
with writing .* after it.
David J.
On Wed, Mar 22, 2023 at 4:46 PM David G. Johnston <
david.g.johnston@gmail.com> wrote:
On Wed, Mar 22, 2023 at 4:32 PM Eske Rahn <eske@septima.dk> wrote:
Hi,
Thanks for the quick answer *:-D*
That was a nice sideeffect of lateral.
In the example, the calling code also gets simplified:
WITH x AS (
SELECT clock_timestamp() rowstart, *, clock_timestamp() rowend FROM (
SELECT '1' inp UNION
SELECT '2'
) y, LATERAL septima.foo(inp) g
)
SELECT * FROM x;That solved the issue at hand, in a much better way. Thanks
Though I still fail to see *why* the other way should generally call the
function for every column in the *result* record - if the function is
STABLE or IMMUTABLE.It gets rewritten to be effectively:
select func_call(...).col1, func_call(...).col2, func_call(...).col3
under the assumption that repeating the function call will be cheap and
side-effect free. It was never ideal but fixing that form of optimization
was harder than implementing LATERAL where the multi-column result has a
natural output in the form of a multi-column table. A normal function call
in the target list really means "return a single value" which is at odds
with writing .* after it.
Actually, it is less "optimization" and more "SQL is strongly typed and all
columns must be defined during query compilation".
David J.