BUG #15321: XMLTABLE leaks memory like crazy

Started by PG Bug reporting formover 7 years ago20 messagesbugs
Jump to latest
#1PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 15321
Logged by: Andrew Gierth
Email address: andrew@tao11.riddles.org.uk
PostgreSQL version: 10.5
Operating system: any
Description:

From a report on IRC:

XMLTABLE runs a lot of setup code in the per-query memory context -
resulting in allocations of copies of namespace names, other values, and
_multiple copies of the passed-in XML document_, which are not freed
anywhere.

Accordingly, virtually any lateral call to XMLTABLE on non-toy amounts of
data will blow up the server memory usage:

select count(*)
from (select ('<rec xmlns="http://foobar&quot;&gt;&#39;
||
repeat('<obj><col1>foo</col1><col2>bar</col2></obj>',10+(i%10))
|| '</rec>')::xml as content
from generate_series(1,1000000) i) s,
xmltable(xmlnamespaces('http://foobar&#39; AS x),
'x:obj'
passing t.content
columns col1 text path 'x:col1'
columns col2 text path 'x:col2'
);
-- uses about 6GB of RAM in my tests

#2Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: PG Bug reporting form (#1)
Re: BUG #15321: XMLTABLE leaks memory like crazy

"PG" == PG Bug reporting form <noreply@postgresql.org> writes:

PG> From a report on IRC:

PG> XMLTABLE runs a lot of setup code in the per-query memory context -
PG> resulting in allocations of copies of namespace names, other
PG> values, and _multiple copies of the passed-in XML document_, which
PG> are not freed anywhere.

Alvaro, I think this comment of yours from when you committed this work
is relevant:

I just pushed XMLTABLE, after some additional changes. Please test
it thoroughly and report any problems.

[...]

Some changes I made:
* I removed the "buildercxt" memory context. It seemed mostly
pointless, and I was disturbed by the MemoryContextResetOnly().
Per-value memory still uses the per-value memory context, but the
rest of the stuff is in the per-query context, which should be
pretty much the same.

A quick reading suggests that the per-value context should have been
changed to not be a child of "buildercxt" (which would avoid the
MemoryContextResetOnly issue - that's a good sign that you've put a
child context under the wrong parent). But the use of the per-query
context instead is exactly what causes this blowup; compare with what
nodeFunctionscan does with its "argcontext" (and the corresponding
comments in ExecMakeTableFunctionResult).

--
Andrew (irc:RhodiumToad)

#3Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Andrew Gierth (#2)
Re: BUG #15321: XMLTABLE leaks memory like crazy

"Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

* I removed the "buildercxt" memory context. It seemed mostly
pointless, and I was disturbed by the MemoryContextResetOnly().
Per-value memory still uses the per-value memory context, but the
rest of the stuff is in the per-query context, which should be
pretty much the same.

Andrew> A quick reading suggests that the per-value context should have
Andrew> been changed to not be a child of "buildercxt" (which would
Andrew> avoid the MemoryContextResetOnly issue - that's a good sign
Andrew> that you've put a child context under the wrong parent). But
Andrew> the use of the per-query context instead is exactly what causes
Andrew> this blowup; compare with what nodeFunctionscan does with its
Andrew> "argcontext" (and the corresponding comments in
Andrew> ExecMakeTableFunctionResult).

And here's a patch (against pg10, but I don't think the code's changed
since) that I think does it the right way.

This works by changing the meaning of perValueCxt - it's now the
per-INPUT-value context (equivalent to FunctionScan's argcontext, but I
didn't change the name for simplicity), not per-output-value; for the
latter, we use the result exprcontext's per-tuple memory like everything
else does (cf. FunctionScan).

I've verified that this fixes the leak; it also passes all other tests I
have thrown at it. Anyone see any issues with this? (CC'ing in Pavel as
original author)

I'll apply it in due course unless anyone says otherwise.

--
Andrew (irc:RhodiumToad)

Attachments:

xmlmem.patchtext/x-patchDownload+9-4
#4Pavel Stehule
pavel.stehule@gmail.com
In reply to: Andrew Gierth (#3)
Re: BUG #15321: XMLTABLE leaks memory like crazy

2018-08-11 5:00 GMT+02:00 Andrew Gierth <andrew@tao11.riddles.org.uk>:

"Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

* I removed the "buildercxt" memory context. It seemed mostly
pointless, and I was disturbed by the MemoryContextResetOnly().
Per-value memory still uses the per-value memory context, but the
rest of the stuff is in the per-query context, which should be
pretty much the same.

Andrew> A quick reading suggests that the per-value context should have
Andrew> been changed to not be a child of "buildercxt" (which would
Andrew> avoid the MemoryContextResetOnly issue - that's a good sign
Andrew> that you've put a child context under the wrong parent). But
Andrew> the use of the per-query context instead is exactly what causes
Andrew> this blowup; compare with what nodeFunctionscan does with its
Andrew> "argcontext" (and the corresponding comments in
Andrew> ExecMakeTableFunctionResult).

And here's a patch (against pg10, but I don't think the code's changed
since) that I think does it the right way.

This works by changing the meaning of perValueCxt - it's now the
per-INPUT-value context (equivalent to FunctionScan's argcontext, but I
didn't change the name for simplicity), not per-output-value; for the
latter, we use the result exprcontext's per-tuple memory like everything
else does (cf. FunctionScan).

I've verified that this fixes the leak; it also passes all other tests I
have thrown at it. Anyone see any issues with this? (CC'ing in Pavel as
original author)

I'll apply it in due course unless anyone says otherwise.

I'll look there this evening.

I am not sure if combination

+    MemoryContextReset(tstate->perValueCxt);
+    MemoryContextSwitchTo(tstate->perValueCxt);

is valid. Usually MemoryContext is reset after some action, not before. But
now, I have not time to look there

Regards

Pavel

Show quoted text

--
Andrew (irc:RhodiumToad)

#5Pavel Stehule
pavel.stehule@gmail.com
In reply to: Pavel Stehule (#4)
Re: BUG #15321: XMLTABLE leaks memory like crazy

Hi

2018-08-11 9:02 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:

2018-08-11 5:00 GMT+02:00 Andrew Gierth <andrew@tao11.riddles.org.uk>:

"Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

* I removed the "buildercxt" memory context. It seemed mostly
pointless, and I was disturbed by the MemoryContextResetOnly().
Per-value memory still uses the per-value memory context, but the
rest of the stuff is in the per-query context, which should be
pretty much the same.

Andrew> A quick reading suggests that the per-value context should have
Andrew> been changed to not be a child of "buildercxt" (which would
Andrew> avoid the MemoryContextResetOnly issue - that's a good sign
Andrew> that you've put a child context under the wrong parent). But
Andrew> the use of the per-query context instead is exactly what causes
Andrew> this blowup; compare with what nodeFunctionscan does with its
Andrew> "argcontext" (and the corresponding comments in
Andrew> ExecMakeTableFunctionResult).

And here's a patch (against pg10, but I don't think the code's changed
since) that I think does it the right way.

This works by changing the meaning of perValueCxt - it's now the
per-INPUT-value context (equivalent to FunctionScan's argcontext, but I
didn't change the name for simplicity), not per-output-value; for the
latter, we use the result exprcontext's per-tuple memory like everything
else does (cf. FunctionScan).

I've verified that this fixes the leak; it also passes all other tests I
have thrown at it. Anyone see any issues with this? (CC'ing in Pavel as
original author)

I'll apply it in due course unless anyone says otherwise.

I'll look there this evening.

I am not sure if combination

+    MemoryContextReset(tstate->perValueCxt);
+    MemoryContextSwitchTo(tstate->perValueCxt);

is valid. Usually MemoryContext is reset after some action, not before.
But now, I have not time to look there

Regards

 +   MemoryContextReset(tstate->perValueCxt);
+   MemoryContextSwitchTo(tstate->perValueCxt);
+
    PG_TRY();

The reset of memory context is useless, because the reset of perValueCxt is
there already on the end of tfuncFetchRows function

I don't understand to using tuple memory context

        ((TableFuncScan *) (tstate->ss.ps.plan))->tablefunc->ordinalitycol;
-   oldcxt = MemoryContextSwitchTo(tstate->perValueCxt);
+   oldcxt = MemoryContextSwitchTo(econtext->ecxt_per_tuple_memory);

/*
* Keep requesting rows from the table builder until there aren't any.
@@ -493,7 +498,7 @@ tfuncLoadRows(TableFuncScanState *tstate, ExprContext
*econtext)

tuplestore_putvalues(tstate->tupstore, tupdesc, values, nulls);

-       MemoryContextReset(tstate->perValueCxt);
+       MemoryContextReset(econtext->ecxt_per_tuple_memory);
    }

When we are running under perValueCxt, then there changing memory context
is useless

I modified your patch. Please, check it.

Regards

Pavel

Show quoted text

Pavel

--
Andrew (irc:RhodiumToad)

Attachments:

xmlmem-2.patchtext/x-patch; charset=US-ASCII; name=xmlmem-2.patchDownload+6-9
#6Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Pavel Stehule (#5)
Re: BUG #15321: XMLTABLE leaks memory like crazy

"Pavel" == Pavel Stehule <pavel.stehule@gmail.com> writes:

Pavel> + MemoryContextReset(tstate->perValueCxt);
Pavel> + MemoryContextSwitchTo(tstate->perValueCxt);
Pavel> +
Pavel> PG_TRY();

Pavel> The reset of memory context is useless, because the reset of
Pavel> perValueCxt is there already on the end of tfuncFetchRows
Pavel> function

It's overkill, yes, but it's also harmless because resetting a context
that's not been touched since the last reset has very little overhead.

Pavel> I don't understand to using tuple memory context

We still need a per-result-tuple memory context otherwise we're leaking
whatever memory got allocated in each GetValue call into the
per-input-value context. (We can use our projection econtext's per-tuple
memory for this because we haven't done any evaluation of output items
for the current cycle yet at the point this code is reached.)

Again, look at functionscan for how this is supposed to work.

Pavel> When we are running under perValueCxt, then there changing
Pavel> memory context is useless

It's not useless at all; it's needed to avoid excess memory usage when a
single XMLTABLE() call returns many rows.

--
Andrew (irc:RhodiumToad)

#7Pavel Stehule
pavel.stehule@gmail.com
In reply to: Andrew Gierth (#6)
Re: BUG #15321: XMLTABLE leaks memory like crazy

2018-08-12 9:38 GMT+02:00 Andrew Gierth <andrew@tao11.riddles.org.uk>:

"Pavel" == Pavel Stehule <pavel.stehule@gmail.com> writes:

Pavel> + MemoryContextReset(tstate->perValueCxt);
Pavel> + MemoryContextSwitchTo(tstate->perValueCxt);
Pavel> +
Pavel> PG_TRY();

Pavel> The reset of memory context is useless, because the reset of
Pavel> perValueCxt is there already on the end of tfuncFetchRows
Pavel> function

It's overkill, yes, but it's also harmless because resetting a context
that's not been touched since the last reset has very little overhead.

Pavel> I don't understand to using tuple memory context

We still need a per-result-tuple memory context otherwise we're leaking
whatever memory got allocated in each GetValue call into the
per-input-value context. (We can use our projection econtext's per-tuple
memory for this because we haven't done any evaluation of output items
for the current cycle yet at the point this code is reached.)

Again, look at functionscan for how this is supposed to work.

it is done by tuplestore_putvalues

Pavel> When we are running under perValueCxt, then there changing
Pavel> memory context is useless

It's not useless at all; it's needed to avoid excess memory usage when a
single XMLTABLE() call returns many rows.

When this context was not necessary before, then it is not need be used
now. Tuplestore does all work

Show quoted text

--
Andrew (irc:RhodiumToad)

#8Pavel Stehule
pavel.stehule@gmail.com
In reply to: Andrew Gierth (#6)
Re: BUG #15321: XMLTABLE leaks memory like crazy

2018-08-12 9:38 GMT+02:00 Andrew Gierth <andrew@tao11.riddles.org.uk>:

"Pavel" == Pavel Stehule <pavel.stehule@gmail.com> writes:

Pavel> + MemoryContextReset(tstate->perValueCxt);
Pavel> + MemoryContextSwitchTo(tstate->perValueCxt);
Pavel> +
Pavel> PG_TRY();

Pavel> The reset of memory context is useless, because the reset of
Pavel> perValueCxt is there already on the end of tfuncFetchRows
Pavel> function

It's overkill, yes, but it's also harmless because resetting a context
that's not been touched since the last reset has very little overhead.

The overhead is +/- zero. But the code will work exactly same without this
line. So it is little bit confusing to use there.

Show quoted text

Pavel> I don't understand to using tuple memory context

We still need a per-result-tuple memory context otherwise we're leaking
whatever memory got allocated in each GetValue call into the
per-input-value context. (We can use our projection econtext's per-tuple
memory for this because we haven't done any evaluation of output items
for the current cycle yet at the point this code is reached.)

Again, look at functionscan for how this is supposed to work.

Pavel> When we are running under perValueCxt, then there changing
Pavel> memory context is useless

It's not useless at all; it's needed to avoid excess memory usage when a
single XMLTABLE() call returns many rows.

--
Andrew (irc:RhodiumToad)

#9Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Pavel Stehule (#7)
Re: BUG #15321: XMLTABLE leaks memory like crazy

"Pavel" == Pavel Stehule <pavel.stehule@gmail.com> writes:

We still need a per-result-tuple memory context otherwise we're
leaking whatever memory got allocated in each GetValue call into the
per-input-value context. (We can use our projection econtext's
per-tuple memory for this because we haven't done any evaluation of
output items for the current cycle yet at the point this code is
reached.)

Again, look at functionscan for how this is supposed to work.

Pavel> it is done by tuplestore_putvalues

No, tuplestore_putvalues is only responsible for the memory it allocates
itself, which belongs to the tuplestore, it has nothing to do with the
memory allocated by its caller - and XmlTableGetValue does quite a few
allocations.

It's true that the only way (I think) to get a lot of result rows from
XMLTABLE is to use a large input value, the memory usage for which will
dwarf that of the output rows (a ~128MB xml value can easily mean using
~4.5GB of backend memory, which seems quite excessively profligate), but
my tests show that using that value, the lack of a per-GetValue context
reset costs perhaps another ~500MB on top for my specific testcase:

select count(*)
from (select ('<rec xmlns="http://x&quot;&gt;&#39;
|| repeat('<o><c>foo</c></o>',8000000+(i%10))
|| '</rec>')::xml as content
from generate_series(1,10) i offset 0) s,
xmltable(xmlnamespaces('http://x&#39; as x),
'x:o'
passing s.content
columns
col1 text path 'x:c');

It's not useless at all; it's needed to avoid excess memory usage
when a single XMLTABLE() call returns many rows.

Pavel> When this context was not necessary before, then it is not need
Pavel> be used now. Tuplestore does all work

Before, you were using perValueCxt and resetting it once per GetValue
call. Since my patch takes perValueCxt to use for another purpose
instead, it needs to be replaced, and econtext->ecxt_per_tuple_memory
is suitable for the job (and consistent with functionscan).

The tuplestore does not do all the work - just look at XmlTableGetValue,
and notice that it has calls to text_to_cstring, pstrdup,
appendStringInfoText, InputFunctionCall, xml_xmlnodetoxmltype, all of
which will allocate memory in the current memory context. All of that
needs to be reset on a per-output-tuple basis.

--
Andrew (irc:RhodiumToad)

#10Pavel Stehule
pavel.stehule@gmail.com
In reply to: Andrew Gierth (#9)
Re: BUG #15321: XMLTABLE leaks memory like crazy

2018-08-12 11:44 GMT+02:00 Andrew Gierth <andrew@tao11.riddles.org.uk>:

"Pavel" == Pavel Stehule <pavel.stehule@gmail.com> writes:

We still need a per-result-tuple memory context otherwise we're
leaking whatever memory got allocated in each GetValue call into the
per-input-value context. (We can use our projection econtext's
per-tuple memory for this because we haven't done any evaluation of
output items for the current cycle yet at the point this code is
reached.)

Again, look at functionscan for how this is supposed to work.

Pavel> it is done by tuplestore_putvalues

No, tuplestore_putvalues is only responsible for the memory it allocates
itself, which belongs to the tuplestore, it has nothing to do with the
memory allocated by its caller - and XmlTableGetValue does quite a few
allocations.

Maybe I used wrong words. Sorry, my English lang is not good.

In master tfuncLoadRows switch to tstate->perValueCxt. You change it by
switch econtext->ecxt_per_tuple_memory what is wrong I thing.

If we don't change memory context, we will stay inside perValueCxt. And
this context will be cleaned outside.

It's true that the only way (I think) to get a lot of result rows from
XMLTABLE is to use a large input value, the memory usage for which will
dwarf that of the output rows (a ~128MB xml value can easily mean using
~4.5GB of backend memory, which seems quite excessively profligate), but
my tests show that using that value, the lack of a per-GetValue context
reset costs perhaps another ~500MB on top for my specific testcase:

select count(*)
from (select ('<rec xmlns="http://x&quot;&gt;&#39;
|| repeat('<o><c>foo</c></o>',8000000+(i%10))
|| '</rec>')::xml as content
from generate_series(1,10) i offset 0) s,
xmltable(xmlnamespaces('http://x&#39; as x),
'x:o'
passing s.content
columns
col1 text path 'x:c');

It's not useless at all; it's needed to avoid excess memory usage
when a single XMLTABLE() call returns many rows.

Pavel> When this context was not necessary before, then it is not need
Pavel> be used now. Tuplestore does all work

Before, you were using perValueCxt and resetting it once per GetValue
call. Since my patch takes perValueCxt to use for another purpose
instead, it needs to be replaced, and econtext->ecxt_per_tuple_memory
is suitable for the job (and consistent with functionscan).

I think so using cxt_per_tuple_memory is not necessary - and my patch is
working too.

Just I removed MemoryContextReset(tstate->perValueCxt) after
tuplestore_putvalues. It is possible, because tstate->perValueCxt is
cleaned immediately in caller tfuncFetchRows function.

The tuplestore does not do all the work - just look at XmlTableGetValue,
and notice that it has calls to text_to_cstring, pstrdup,
appendStringInfoText, InputFunctionCall, xml_xmlnodetoxmltype, all of
which will allocate memory in the current memory context. All of that
needs to be reset on a per-output-tuple basis.

But it has own context or it can works under perValueCxt

I think using two memory contexts is not necessary, and with just one
context, the code can be simpler.

Regards

Pavel

Show quoted text

--
Andrew (irc:RhodiumToad)

#11Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Pavel Stehule (#10)
Re: BUG #15321: XMLTABLE leaks memory like crazy

"Pavel" == Pavel Stehule <pavel.stehule@gmail.com> writes:

No, tuplestore_putvalues is only responsible for the memory it
allocates itself, which belongs to the tuplestore, it has nothing to
do with the memory allocated by its caller - and XmlTableGetValue
does quite a few allocations.

Pavel> Maybe I used wrong words. Sorry, my English lang is not good.

Pavel> In master tfuncLoadRows switch to tstate->perValueCxt. You
Pavel> change it by switch econtext->ecxt_per_tuple_memory what is
Pavel> wrong I thing.

Pavel> If we don't change memory context, we will stay inside
Pavel> perValueCxt. And this context will be cleaned outside.

Yes, but it won't be cleaned up until _all_ the rows from a single call
have been generated, which means that the allocations from GetValue have
been (uselessly) accumulating during this time, wasting memory.

We do need a context that is reset for every output row, as perValueCxt
used to be. I don't see why this is even in question.

Before, you were using perValueCxt and resetting it once per GetValue
call. Since my patch takes perValueCxt to use for another purpose
instead, it needs to be replaced, and econtext->ecxt_per_tuple_memory
is suitable for the job (and consistent with functionscan).

Pavel> I think so using cxt_per_tuple_memory is not necessary - and my
Pavel> patch is working too.

Working, just using more memory than it needs to.

Pavel> Just I removed MemoryContextReset(tstate->perValueCxt) after
Pavel> tuplestore_putvalues. It is possible, because
Pavel> tstate->perValueCxt is cleaned immediately in caller
Pavel> tfuncFetchRows function.

But that's not "immediately" because tfuncLoadRows is looping over the
FetchRow call, and calling GetValue for each column in that row, and in
your version it is _not cleaning up memory in that loop_.

--
Andrew (irc:RhodiumToad)

#12Pavel Stehule
pavel.stehule@gmail.com
In reply to: Andrew Gierth (#11)
Re: BUG #15321: XMLTABLE leaks memory like crazy

2018-08-12 12:25 GMT+02:00 Andrew Gierth <andrew@tao11.riddles.org.uk>:

"Pavel" == Pavel Stehule <pavel.stehule@gmail.com> writes:

No, tuplestore_putvalues is only responsible for the memory it
allocates itself, which belongs to the tuplestore, it has nothing to
do with the memory allocated by its caller - and XmlTableGetValue
does quite a few allocations.

Pavel> Maybe I used wrong words. Sorry, my English lang is not good.

Pavel> In master tfuncLoadRows switch to tstate->perValueCxt. You
Pavel> change it by switch econtext->ecxt_per_tuple_memory what is
Pavel> wrong I thing.

Pavel> If we don't change memory context, we will stay inside
Pavel> perValueCxt. And this context will be cleaned outside.

Yes, but it won't be cleaned up until _all_ the rows from a single call
have been generated, which means that the allocations from GetValue have
been (uselessly) accumulating during this time, wasting memory.

We do need a context that is reset for every output row, as perValueCxt
used to be. I don't see why this is even in question.

Before, you were using perValueCxt and resetting it once per GetValue
call. Since my patch takes perValueCxt to use for another purpose
instead, it needs to be replaced, and econtext->ecxt_per_tuple_memory
is suitable for the job (and consistent with functionscan).

Pavel> I think so using cxt_per_tuple_memory is not necessary - and my
Pavel> patch is working too.

Working, just using more memory than it needs to.

Pavel> Just I removed MemoryContextReset(tstate->perValueCxt) after
Pavel> tuplestore_putvalues. It is possible, because
Pavel> tstate->perValueCxt is cleaned immediately in caller
Pavel> tfuncFetchRows function.

But that's not "immediately" because tfuncLoadRows is looping over the
FetchRow call, and calling GetValue for each column in that row, and in
your version it is _not cleaning up memory in that loop_.

ok, now I am maybe understand to your motivation.

Usually, loading row should be memory cheap operation, but sure some bytes
it can take.

Then I don't like too much using ecxt_per_tuple_memory for this. Maybe
better to create own short life context for purpose. Or do better comments
about using this memory context for very short life task, please.

Regards

Pavel

Show quoted text

--
Andrew (irc:RhodiumToad)

#13Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Pavel Stehule (#12)
Re: BUG #15321: XMLTABLE leaks memory like crazy

"Pavel" == Pavel Stehule <pavel.stehule@gmail.com> writes:

But that's not "immediately" because tfuncLoadRows is looping over
the FetchRow call, and calling GetValue for each column in that row,
and in your version it is _not cleaning up memory in that loop_.

Pavel> ok, now I am maybe understand to your motivation.

Pavel> Usually, loading row should be memory cheap operation, but sure
Pavel> some bytes it can take.

Yes, it'll usually be small, but you shouldn't assume that (and some
type input functions may use more memory than you think, since doing a
lot of retail pfree() calls can really slow things down).

Pavel> Then I don't like too much using ecxt_per_tuple_memory for this.

It's there, it has the right lifetime, allocating another one just for
this is a waste. Furthermore, this is the same pattern that FunctionScan
uses, so it's more consistent.

Pavel> Or do better comments about using this memory context for very
Pavel> short life task, please.

What specifically do you think needs explaining?

Attached patch is the same logic as before but with more comments.

--
Andrew (irc:RhodiumToad)

Attachments:

xmlmem.patchtext/x-patchDownload+24-4
#14Pavel Stehule
pavel.stehule@gmail.com
In reply to: Andrew Gierth (#13)
Re: BUG #15321: XMLTABLE leaks memory like crazy

2018-08-12 13:18 GMT+02:00 Andrew Gierth <andrew@tao11.riddles.org.uk>:

"Pavel" == Pavel Stehule <pavel.stehule@gmail.com> writes:

But that's not "immediately" because tfuncLoadRows is looping over
the FetchRow call, and calling GetValue for each column in that row,
and in your version it is _not cleaning up memory in that loop_.

Pavel> ok, now I am maybe understand to your motivation.

Pavel> Usually, loading row should be memory cheap operation, but sure
Pavel> some bytes it can take.

Yes, it'll usually be small, but you shouldn't assume that (and some
type input functions may use more memory than you think, since doing a
lot of retail pfree() calls can really slow things down).

Pavel> Then I don't like too much using ecxt_per_tuple_memory for this.

It's there, it has the right lifetime, allocating another one just for
this is a waste. Furthermore, this is the same pattern that FunctionScan
uses, so it's more consistent.

Pavel> Or do better comments about using this memory context for very
Pavel> short life task, please.

What specifically do you think needs explaining?

I don't feel well, when context named GetValue has longer life than
ecxt_tuple_memory context. I understand so it is not important in SRF
function, but it looks strange for me.

It is my subjective option, and I have not any strong arguments for it. If
commiter think so it is ok, then I can live with it :)

Attached patch is the same logic as before but with more comments.

ok

Regards

Pavel

Show quoted text

--
Andrew (irc:RhodiumToad)

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Gierth (#13)
Re: BUG #15321: XMLTABLE leaks memory like crazy

Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

Attached patch is the same logic as before but with more comments.

This looks generally reasonable to me, but what's the point of doing
two MemoryContextReset calls in tfuncFetchRows? Doing one at the
end should be sufficient to guarantee that the context is empty already
at the next call.

Also, I'd be inclined to rename "perValueCxt" to something else,
it's not really correctly named for this usage pattern.

regards, tom lane

#16Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Tom Lane (#15)
Re: BUG #15321: XMLTABLE leaks memory like crazy

"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

Tom> Also, I'd be inclined to rename "perValueCxt" to something else,
Tom> it's not really correctly named for this usage pattern.

perCallCxt? or do you have a better name?

--
Andrew (irc:RhodiumToad)

#17Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Gierth (#16)
Re: BUG #15321: XMLTABLE leaks memory like crazy

Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> Also, I'd be inclined to rename "perValueCxt" to something else,
Tom> it's not really correctly named for this usage pattern.

perCallCxt? or do you have a better name?

I was hoping to avoid being pinned down on that ;-). But maybe
fetchRowsCxt or tableBuildCxt or something along that line?
It's not very clear what "perCall" is per call of, so I'm not
in love with that idea.

regards, tom lane

#18Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Tom Lane (#17)
Re: BUG #15321: XMLTABLE leaks memory like crazy

"Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:

perCallCxt? or do you have a better name?

Tom> I was hoping to avoid being pinned down on that ;-).

Well, they do say there are 2 hard problems in computing: finding names
for things, cache invalidation, and off-by-one errors :-)

Tom> But maybe fetchRowsCxt or tableBuildCxt or something along that
Tom> line? It's not very clear what "perCall" is per call of, so I'm
Tom> not in love with that idea.

Meh. perTableCxt?

--
Andrew (irc:RhodiumToad)

#19Andrew Gierth
andrew@tao11.riddles.org.uk
In reply to: Andrew Gierth (#18)
Re: BUG #15321: XMLTABLE leaks memory like crazy

"Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

Tom> But maybe fetchRowsCxt or tableBuildCxt or something along that
Tom> line? It's not very clear what "perCall" is per call of, so I'm
Tom> not in love with that idea.

Andrew> Meh. perTableCxt?

Pushed it with perTableCxt as the name.

--
Andrew (irc:RhodiumToad)

#20Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Andrew Gierth (#19)
Re: BUG #15321: XMLTABLE leaks memory like crazy

On 2018-Aug-13, Andrew Gierth wrote:

"Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:

Tom> But maybe fetchRowsCxt or tableBuildCxt or something along that
Tom> line? It's not very clear what "perCall" is per call of, so I'm
Tom> not in love with that idea.

Andrew> Meh. perTableCxt?

Pushed it with perTableCxt as the name.

Thanks for taking care of this! Much appreciated.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services