Avoiding memory leak when compilation of a function fails

Started by Tom Lane11 months ago3 messageshackers
Jump to latest
#1Tom Lane
tgl@sss.pgh.pa.us

Back in [1]/messages/by-id/20210317055718.v6qs3ltzrformqoa@alap3.anarazel.de, Andres complained that repeated attempts to create
an invalid plpgsql function (one that fails initial compilation)
leak memory, for example

DO $do$
BEGIN
FOR i IN 1 .. 100000 LOOP
BEGIN
CREATE OR REPLACE FUNCTION foo() RETURNS VOID
LANGUAGE plpgsql AS $f$BEGIN frakbar; END;$f$;
EXCEPTION WHEN others THEN
END;
END LOOP;
END;$do$;

The reason is that we create the long-lived function cache context
and detect the syntax error only while trying to fill it in.
As I remarked at the time, we could make that better by making
the cache context initially short-lived and reparenting it only
after it's known good. The attached patch does that.

I noted that the CachedFunction struct made by funccache.c gets
leaked too. (That's not new, but the blame used to fall on plpgsql's
equivalent of that code.) That's not hard to fix in typical cases,
at the price of an extra PG_TRY, which seems okay in a code path that
is setting up a long-lived cache entry. Also done in the attached.

I thought that SQL-language functions might have this issue too,
but they do not, because sql_compile_callback already uses the
reparenting trick. (I followed its lead in making the function
contexts live under CacheMemoryContext not TopMemoryContext.)

If you run the above example long enough, you will also observe a
slow leak in TopTransactionContext. AFAICT that is from accumulating
invalidation messages from the failed pg_proc insertions, so it's not
specific to functions but applies to any DDL in a loop. Fixing that
seems outside the scope of this patch.

I think this is a bug fix, so I'm inclined to squeeze it into v18.
I am not sure if it's worth developing a back-patchable version.
The pl_comp.c change probably applies easily further back, and
would be enough to get the bulk of the benefit.

Comments?

regards, tom lane

[1]: /messages/by-id/20210317055718.v6qs3ltzrformqoa@alap3.anarazel.de

Attachments:

v1-avoid-leak-when-compiling-broken-function.patchtext/x-diff; charset=us-ascii; name=v1-avoid-leak-when-compiling-broken-function.patchDownload+42-11
#2Amul Sul
sulamul@gmail.com
In reply to: Tom Lane (#1)
Re: Avoiding memory leak when compilation of a function fails

On Tue, May 27, 2025 at 4:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Back in [1], Andres complained that repeated attempts to create
an invalid plpgsql function (one that fails initial compilation)
leak memory, for example

DO $do$
BEGIN
FOR i IN 1 .. 100000 LOOP
BEGIN
CREATE OR REPLACE FUNCTION foo() RETURNS VOID
LANGUAGE plpgsql AS $f$BEGIN frakbar; END;$f$;
EXCEPTION WHEN others THEN
END;
END LOOP;
END;$do$;

The reason is that we create the long-lived function cache context
and detect the syntax error only while trying to fill it in.
As I remarked at the time, we could make that better by making
the cache context initially short-lived and reparenting it only
after it's known good. The attached patch does that.

I noted that the CachedFunction struct made by funccache.c gets
leaked too. (That's not new, but the blame used to fall on plpgsql's
equivalent of that code.) That's not hard to fix in typical cases,
at the price of an extra PG_TRY, which seems okay in a code path that
is setting up a long-lived cache entry. Also done in the attached.

I thought that SQL-language functions might have this issue too,
but they do not, because sql_compile_callback already uses the
reparenting trick. (I followed its lead in making the function
contexts live under CacheMemoryContext not TopMemoryContext.)

If you run the above example long enough, you will also observe a
slow leak in TopTransactionContext. AFAICT that is from accumulating
invalidation messages from the failed pg_proc insertions, so it's not
specific to functions but applies to any DDL in a loop. Fixing that
seems outside the scope of this patch.

I think this is a bug fix, so I'm inclined to squeeze it into v18.
I am not sure if it's worth developing a back-patchable version.
The pl_comp.c change probably applies easily further back, and
would be enough to get the bulk of the benefit.

Comments?

The patch seems reasonable and the changes appear straightforward
enough for a backport. However, I am not sure about the backporting,
as the leak doesn't seem to occur very frequently.

Regards,
Amul

#3Pavel Stehule
pavel.stehule@gmail.com
In reply to: Amul Sul (#2)
Re: Avoiding memory leak when compilation of a function fails

út 27. 5. 2025 v 7:27 odesílatel Amul Sul <sulamul@gmail.com> napsal:

On Tue, May 27, 2025 at 4:23 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Back in [1], Andres complained that repeated attempts to create
an invalid plpgsql function (one that fails initial compilation)
leak memory, for example

DO $do$
BEGIN
FOR i IN 1 .. 100000 LOOP
BEGIN
CREATE OR REPLACE FUNCTION foo() RETURNS VOID
LANGUAGE plpgsql AS $f$BEGIN frakbar; END;$f$;
EXCEPTION WHEN others THEN
END;
END LOOP;
END;$do$;

The reason is that we create the long-lived function cache context
and detect the syntax error only while trying to fill it in.
As I remarked at the time, we could make that better by making
the cache context initially short-lived and reparenting it only
after it's known good. The attached patch does that.

I noted that the CachedFunction struct made by funccache.c gets
leaked too. (That's not new, but the blame used to fall on plpgsql's
equivalent of that code.) That's not hard to fix in typical cases,
at the price of an extra PG_TRY, which seems okay in a code path that
is setting up a long-lived cache entry. Also done in the attached.

I thought that SQL-language functions might have this issue too,
but they do not, because sql_compile_callback already uses the
reparenting trick. (I followed its lead in making the function
contexts live under CacheMemoryContext not TopMemoryContext.)

If you run the above example long enough, you will also observe a
slow leak in TopTransactionContext. AFAICT that is from accumulating
invalidation messages from the failed pg_proc insertions, so it's not
specific to functions but applies to any DDL in a loop. Fixing that
seems outside the scope of this patch.

I think this is a bug fix, so I'm inclined to squeeze it into v18.
I am not sure if it's worth developing a back-patchable version.
The pl_comp.c change probably applies easily further back, and
would be enough to get the bulk of the benefit.

Comments?

The patch seems reasonable and the changes appear straightforward
enough for a backport. However, I am not sure about the backporting,
as the leak doesn't seem to occur very frequently.

+1

Regards

Pavel

Show quoted text

Regards,
Amul