memory leak in auto_explain

Started by Jeff Janesalmost 5 years ago5 messages
#1Jeff Janes
jeff.janes@gmail.com

I accidentally tried to populate a test case
while auto_explain.log_min_duration was set to
zero. auto_explain.log_nested_statements was also on.

create or replace function gibberish(int) returns text language SQL as $_$
select left(string_agg(md5(random()::text),$$$$),$1) from
generate_series(0,$1/32) $_$;

create table j1 as select x, md5(random()::text) as t11, gibberish(1500) as
t12 from generate_series(1,20e6) f(x);

I got logorrhea of course, but I also got a memory leak into the SQL
function context:

TopPortalContext: 8192 total in 1 blocks; 7656 free (0 chunks); 536 used
PortalContext: 16384 total in 5 blocks; 5328 free (1 chunks); 11056
used: <unnamed>
ExecutorState: 4810120 total in 13 blocks; 4167160 free (74922
chunks); 642960 used
SQL function: 411058232 total in 60 blocks; 4916568 free (4
chunks); 406141664 used: gibberish

The memory usage grew until OOM killer stepped in.

Cheers,

Jeff

#2Jeff Janes
jeff.janes@gmail.com
In reply to: Jeff Janes (#1)
Re: memory leak in auto_explain

On Mon, Feb 1, 2021 at 6:09 PM Jeff Janes <jeff.janes@gmail.com> wrote:

create or replace function gibberish(int) returns text language SQL as $_$
select left(string_agg(md5(random()::text),$$$$),$1) from
generate_series(0,$1/32) $_$;

create table j1 as select x, md5(random()::text) as t11, gibberish(1500)
as t12 from generate_series(1,20e6) f(x);

I should have added, found it on HEAD, verified it also in 12.5.

Cheers,

Jeff

Show quoted text
#3japin
japinli@hotmail.com
In reply to: Jeff Janes (#2)
1 attachment(s)
Re: memory leak in auto_explain

On Tue, 02 Feb 2021 at 07:12, Jeff Janes <jeff.janes@gmail.com> wrote:

On Mon, Feb 1, 2021 at 6:09 PM Jeff Janes <jeff.janes@gmail.com> wrote:

create or replace function gibberish(int) returns text language SQL as $_$
select left(string_agg(md5(random()::text),$$$$),$1) from
generate_series(0,$1/32) $_$;

create table j1 as select x, md5(random()::text) as t11, gibberish(1500)
as t12 from generate_series(1,20e6) f(x);

I should have added, found it on HEAD, verified it also in 12.5.

Here's my analysis:
1) In the explain_ExecutorEnd(), it will create a ExplainState on SQL function
memory context, which is a long-lived, cause the memory grow up.

/*
* Switch to context in which the fcache lives. This ensures that our
* tuplestore etc will have sufficient lifetime. The sub-executor is
* responsible for deleting per-tuple information. (XXX in the case of a
* long-lived FmgrInfo, this policy represents more memory leakage, but
* it's not entirely clear where to keep stuff instead.)
*/
oldcontext = MemoryContextSwitchTo(fcache->fcontext);

2) I try to call pfree() to release ExplainState memory, however, it does not
make sence, I do not know why this does not work? So I try to create it in
queryDesc->estate->es_query_cxt memory context like queryDesc->totaltime, and
it works.

Attached fix the memory leakage in auto_explain. Any thoughts?

--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.

Attachments:

v1-0001-Fix-memory-leak-in-auto_explain.patchtext/x-patchDownload
From 2589cf9c78bbd308b6f300277500c920152ee6f2 Mon Sep 17 00:00:00 2001
From: Japin Li <japinli@hotmail.com>
Date: Tue, 2 Feb 2021 17:27:21 +0800
Subject: [PATCH v1] Fix memory leak in auto_explain

---
 contrib/auto_explain/auto_explain.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index faa6231d87..033540a3f4 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -383,6 +383,7 @@ explain_ExecutorEnd(QueryDesc *queryDesc)
 		msec = queryDesc->totaltime->total * 1000.0;
 		if (msec >= auto_explain_log_min_duration)
 		{
+			MemoryContext oldctx = MemoryContextSwitchTo(queryDesc->estate->es_query_cxt);
 			ExplainState *es = NewExplainState();
 
 			es->analyze = (queryDesc->instrument_options && auto_explain_log_analyze);
@@ -426,6 +427,7 @@ explain_ExecutorEnd(QueryDesc *queryDesc)
 					 errhidestmt(true)));
 
 			pfree(es->str->data);
+			MemoryContextSwitchTo(oldctx);
 		}
 	}
 
-- 
2.30.0

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: japin (#3)
Re: memory leak in auto_explain

japin <japinli@hotmail.com> writes:

Here's my analysis:
1) In the explain_ExecutorEnd(), it will create a ExplainState on SQL function
memory context, which is a long-lived, cause the memory grow up.

Yeah, agreed. I think people looking at this have assumed that the
ExecutorEnd hook would automatically be running in the executor's
per-query context, but that's not so; we haven't yet entered
standard_ExecutorEnd where the context switch is. The caller's
context is likely to be much longer-lived than the executor's.

I think we should put the switch one level further out than you have
it here, just to be sure that InstrEndLoop is covered too (that doesn't
allocate memory, but auto_explain shouldn't assume that). Otherwise
seems like a good fix.

regards, tom lane

#5japin
japinli@hotmail.com
In reply to: Tom Lane (#4)
Re: memory leak in auto_explain

On Wed, 03 Feb 2021 at 02:13, Tom Lane <tgl@sss.pgh.pa.us> wrote:

japin <japinli@hotmail.com> writes:

Here's my analysis:
1) In the explain_ExecutorEnd(), it will create a ExplainState on SQL function
memory context, which is a long-lived, cause the memory grow up.

Yeah, agreed. I think people looking at this have assumed that the
ExecutorEnd hook would automatically be running in the executor's
per-query context, but that's not so; we haven't yet entered
standard_ExecutorEnd where the context switch is. The caller's
context is likely to be much longer-lived than the executor's.

I think we should put the switch one level further out than you have
it here, just to be sure that InstrEndLoop is covered too (that doesn't
allocate memory, but auto_explain shouldn't assume that). Otherwise
seems like a good fix.

Thanks for your review and clarification.

--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.