memory leak in auto_explain

Started by Jeff Janesover 5 years ago5 messageshackers

jeff.janes@gmail.com

over 5 years ago

I accidentally tried to populate a test case
while auto_explain.log_min_duration was set to
zero. auto_explain.log_nested_statements was also on.

create or replace function gibberish(int) returns text language SQL as $_$
select left(string_agg(md5(random()::text),$$$$),$1) from
generate_series(0,$1/32) $_$;

create table j1 as select x, md5(random()::text) as t11, gibberish(1500) as
t12 from generate_series(1,20e6) f(x);

I got logorrhea of course, but I also got a memory leak into the SQL
function context:

TopPortalContext: 8192 total in 1 blocks; 7656 free (0 chunks); 536 used
PortalContext: 16384 total in 5 blocks; 5328 free (1 chunks); 11056
used: <unnamed>
ExecutorState: 4810120 total in 13 blocks; 4167160 free (74922
chunks); 642960 used
SQL function: 411058232 total in 60 blocks; 4916568 free (4
chunks); 406141664 used: gibberish

The memory usage grew until OOM killer stepped in.

Cheers,

Jeff

Jeff Janes

jeff.janes@gmail.com

over 5 years ago

In reply to: Jeff Janes (#1)

Re: memory leak in auto_explain

On Mon, Feb 1, 2021 at 6:09 PM Jeff Janes <jeff.janes@gmail.com> wrote:

create or replace function gibberish(int) returns text language SQL as $_$
select left(string_agg(md5(random()::text),$$$$),$1) from
generate_series(0,$1/32) $_$;

create table j1 as select x, md5(random()::text) as t11, gibberish(1500)
as t12 from generate_series(1,20e6) f(x);

I should have added, found it on HEAD, verified it also in 12.5.

Cheers,

Jeff

Show quoted text

Japin Li

japinli@hotmail.com

over 5 years ago

In reply to: Jeff Janes (#2)

Re: memory leak in auto_explain

On Tue, 02 Feb 2021 at 07:12, Jeff Janes <jeff.janes@gmail.com> wrote:

On Mon, Feb 1, 2021 at 6:09 PM Jeff Janes <jeff.janes@gmail.com> wrote:

create or replace function gibberish(int) returns text language SQL as $_$
select left(string_agg(md5(random()::text),$$$$),$1) from
generate_series(0,$1/32) $_$;

create table j1 as select x, md5(random()::text) as t11, gibberish(1500)
as t12 from generate_series(1,20e6) f(x);

I should have added, found it on HEAD, verified it also in 12.5.

Here's my analysis:
1) In the explain_ExecutorEnd(), it will create a ExplainState on SQL function
memory context, which is a long-lived, cause the memory grow up.

/*
* Switch to context in which the fcache lives. This ensures that our
* tuplestore etc will have sufficient lifetime. The sub-executor is
* responsible for deleting per-tuple information. (XXX in the case of a
* long-lived FmgrInfo, this policy represents more memory leakage, but
* it's not entirely clear where to keep stuff instead.)
*/
oldcontext = MemoryContextSwitchTo(fcache->fcontext);

2) I try to call pfree() to release ExplainState memory, however, it does not
make sence, I do not know why this does not work? So I try to create it in
queryDesc->estate->es_query_cxt memory context like queryDesc->totaltime, and
it works.

Attached fix the memory leakage in auto_explain. Any thoughts?

--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.

Tom Lane

tgl@sss.pgh.pa.us

over 5 years ago

In reply to: Japin Li (#3)

Re: memory leak in auto_explain

japin <japinli@hotmail.com> writes:

Here's my analysis:
1) In the explain_ExecutorEnd(), it will create a ExplainState on SQL function
memory context, which is a long-lived, cause the memory grow up.

Yeah, agreed. I think people looking at this have assumed that the
ExecutorEnd hook would automatically be running in the executor's
per-query context, but that's not so; we haven't yet entered
standard_ExecutorEnd where the context switch is. The caller's
context is likely to be much longer-lived than the executor's.

I think we should put the switch one level further out than you have
it here, just to be sure that InstrEndLoop is covered too (that doesn't
allocate memory, but auto_explain shouldn't assume that). Otherwise
seems like a good fix.

regards, tom lane

Japin Li

japinli@hotmail.com

over 5 years ago

In reply to: Tom Lane (#4)

Re: memory leak in auto_explain

On Wed, 03 Feb 2021 at 02:13, Tom Lane <tgl@sss.pgh.pa.us> wrote:

japin <japinli@hotmail.com> writes:

Here's my analysis:
1) In the explain_ExecutorEnd(), it will create a ExplainState on SQL function
memory context, which is a long-lived, cause the memory grow up.

Yeah, agreed. I think people looking at this have assumed that the
ExecutorEnd hook would automatically be running in the executor's
per-query context, but that's not so; we haven't yet entered
standard_ExecutorEnd where the context switch is. The caller's
context is likely to be much longer-lived than the executor's.

I think we should put the switch one level further out than you have
it here, just to be sure that InstrEndLoop is covered too (that doesn't
allocate memory, but auto_explain shouldn't assume that). Otherwise
seems like a good fix.

Thanks for your review and clarification.

--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.

memory leak in auto_explain

Attachments: