INOUT params with expanded objects

Started by Jim Mlodgenski2 days ago4 messages
#1Jim Mlodgenski
Jim Mlodgenski
jimmy76@gmail.com
1 attachment(s)

I have an extension[1]https://github.com/aws/pgcollection that adds a collection data type using the expanded
object API. Things perform well as it gets passed around inside a plpgsql
function, but when passing it as an INOUT parameter, I'm hitting a double
free. It looks like the object gets transferred to the context of the
procedure which is freed on completion, but still referenced in the calling
function. There certainly can be something in my extension code, but I found
that the attached patch to exec_assign_value does solve the issue. It's likely
not the right solution with taking a sledge hammer to flatten the object but
it's pointing me to think the issue can be on the plpgsql side. Does anyone
have any ideas I can try on the extension side or better way to solve it on
the plpgsql side?

CREATE PROCEDURE prc(INOUT c collection)
AS $$
BEGIN
c := add(c, 'A', 1::int4);
END;
$$ LANGUAGE plpgsql;

DO $$
DECLARE
c collection('int4');
BEGIN
FOR i IN 1..2 LOOP
CALL prc(c);
END LOOP;
END;
$$;

ERROR: pfree called with invalid pointer 0x56424d513148 (header
0x7f7f7f7f7f7f7f7f)
CONTEXT: PL/pgSQL function inline_code_block line 6 at CALL

-- Jim

[1]: https://github.com/aws/pgcollection

Attachments:

inout_expanded_objects.patchapplication/octet-stream; name=inout_expanded_objects.patch
#2Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Mlodgenski (#1)
Re: INOUT params with expanded objects

Jim Mlodgenski <jimmy76@gmail.com> writes:

I have an extension[1] that adds a collection data type using the expanded
object API. Things perform well as it gets passed around inside a plpgsql
function, but when passing it as an INOUT parameter, I'm hitting a double
free.

You really need to show us your C code. I tried this variant of
your test case:

CREATE PROCEDURE prc(INOUT c int[], i int, j int)
AS $$
BEGIN
c[i] := j;
END;
$$ LANGUAGE plpgsql;

DO $$
DECLARE
c int[];
BEGIN
FOR i IN 1..10 LOOP
CALL prc(c, i, i+10);
END LOOP;
RAISE NOTICE 'c = %', c;
END;
$$;

and got what I expected:

NOTICE: c = {11,12,13,14,15,16,17,18,19,20}

Tracing suggests that the expanded array object created by the
subscript assignment is getting flattened on the way out of the
procedure in order to stuff it into the composite value that is the
procedure's actual result. So that's pretty sad from a performance
standpoint: it means we aren't getting any real benefit from the
INOUT variable. But it also means that there's not any sharing of
expanded-object state between the procedure and its caller, so
it's not apparent to me that this example should be bug-prone.

In any case, the lack of failure for an array seems to me to let
plpgsql off the hook. There's something in your extension that's
not doing what's expected.

regards, tom lane

#3Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#2)
Re: INOUT params with expanded objects

I wrote:

Tracing suggests that the expanded array object created by the
subscript assignment is getting flattened on the way out of the
procedure in order to stuff it into the composite value that is the
procedure's actual result. So that's pretty sad from a performance
standpoint: it means we aren't getting any real benefit from the
INOUT variable.

BTW, just to flesh out what "pretty sad" means, here are two
equivalent implementations of my example:

CREATE or replace PROCEDURE prc(INOUT c int[], i int, j int)
AS $$
BEGIN
c[i] := j;
END;
$$ LANGUAGE plpgsql;

CREATE or replace FUNCTION fnc(INOUT c int[], i int, j int)
AS $$
BEGIN
c[i] := j;
END;
$$ LANGUAGE plpgsql;

\timing on

DO $$
DECLARE
c int[];
BEGIN
FOR i IN 1..100000 LOOP
CALL prc(c, i, i+10);
END LOOP;
--RAISE NOTICE 'c = %', c;
END;
$$;

DO $$
DECLARE
c int[];
BEGIN
FOR i IN 1..100000 LOOP
c := fnc(c, i, i+10);
END LOOP;
--RAISE NOTICE 'c = %', c;
END;
$$;

The first DO-block takes about 47 seconds on my workstation
(in an --enable-cassert build, so take that with a grain of
salt, but certainly it's slow). The second takes 50ms.
If I mark the function IMMUTABLE as it really ought to be,
that drops to 45ms.

While I'd be glad to see the procedure's speed improved,
there's a lot standing in the way of making that happen.

Certainly this case shouldn't crash, so there's something
to fix here, but it's best not to use procedures when a
function could serve.

regards, tom lane

#4Jim Mlodgenski
Jim Mlodgenski
jimmy76@gmail.com
In reply to: Tom Lane (#2)
Re: INOUT params with expanded objects

On Wed, Dec 10, 2025 at 7:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Jim Mlodgenski <jimmy76@gmail.com> writes:

I have an extension[1] that adds a collection data type using the expanded
object API. Things perform well as it gets passed around inside a plpgsql
function, but when passing it as an INOUT parameter, I'm hitting a double
free.

You really need to show us your C code. I tried this variant of
your test case:

The code is here for reference
https://github.com/aws/pgcollection/blob/main/src/collection.c

In any case, the lack of failure for an array seems to me to let
plpgsql off the hook. There's something in your extension that's
not doing what's expected.

Thanks for pulling me up from the rabbit hole. It looks like when things
come as an INOUT, they are RO and the extension is not handling that
properly. Twiddling with that solves my double free issue, but has other
impacts based on my bad assumptions. It does set me on the right path
though for a full fix in the extension.

--Jim