Recursive calls to functions that return sets
Imagine the following scenario:
Function 'A' returns SETOF 'x'. It will issue a query using SPI that
calls function 'B'. This function returns SETOF 'y'.
Each tuple of 'x' is formed from some data in 'y'.
There will be millions of tuples so building a set of 'y' in memory is
not an option.
What would the recommended use of MemoryContexts in an SRF function be
in order to make this work? The SPI_connect must be issued during the
SRF_IS_FIRSTCALL() phase. The Portal that it creates must remain alive
until it's time for the SRF_RETURN_DONE(). What would the recommended
approach be to accomplish this efficiently (and without introducing a
major memory leak)?
The problem I'm trying to solve is a generic one. It's very possible
that the recursion is of arbitrary depth.
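For concreteness, the pattern being asked about might look roughly like this. This is a hedged sketch, not tested code: the function name `stream_rows`, the query text, and the per-row handling are hypothetical, the exact SPI signatures have varied between PostgreSQL releases, and (as the rest of the thread discusses) holding an SPI connection open across calls like this is exactly where the trouble starts.

```c
#include "postgres.h"
#include "funcapi.h"
#include "executor/spi.h"

/* Hypothetical SRF that streams rows from an SPI cursor one at a time.
 * The portal name is saved in user_fctx (allocated in the
 * multi_call_memory_ctx) so the Portal can be found again on each call. */
Datum
stream_rows(PG_FUNCTION_ARGS)
{
	FuncCallContext *funcctx;
	Portal		portal;

	if (SRF_IS_FIRSTCALL())
	{
		MemoryContext oldcxt;
		SPIPlanPtr	plan;

		funcctx = SRF_FIRSTCALL_INIT();
		oldcxt = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);

		SPI_connect();
		plan = SPI_prepare("SELECT y FROM b()", 0, NULL);
		portal = SPI_cursor_open(NULL, plan, NULL, NULL, true);

		/* Remember the portal by name; the Portal itself is managed
		 * by the portal machinery, not allocated in our context. */
		funcctx->user_fctx = pstrdup(portal->name);

		MemoryContextSwitchTo(oldcxt);
	}

	funcctx = SRF_PERCALL_SETUP();
	portal = SPI_cursor_find((char *) funcctx->user_fctx);

	SPI_cursor_fetch(portal, true, 1);
	if (SPI_processed > 0)
	{
		/* build the 'x' tuple from the fetched 'y' row here ... */
		HeapTuple	tup = SPI_copytuple(SPI_tuptable->vals[0]);

		SRF_RETURN_NEXT(funcctx, HeapTupleGetDatum(tup));
	}

	SPI_cursor_close(portal);
	SPI_finish();
	SRF_RETURN_DONE(funcctx);
}
```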
Regards,
Thomas Hallgren
On Wed, Mar 22, 2006 at 03:31:59PM +0100, Thomas Hallgren wrote:
Imagine the following scenario:
Function 'A' returns SETOF 'x'. It will issue a query using SPI that
calls function 'B'. This function returns SETOF 'y'.
Each tuple of 'x' is formed from some data in 'y'.
There will be millions of tuples so building a set of 'y' in memory is
not an option.
I think you're running into a small limitation of set functions here.
If you look at nodeFunctionScan.c that handles this, you can see that
the code is written in such a way as to collect all the tuples first
before returning anything. Not sure why it does that, probably to
handle mark/restore, though that isn't stated anywhere in the code.
What would the recommended use of MemoryContexts in an SRF function be
in order to make this work? The SPI_connect must be issued during the
SRF_IS_FIRSTCALL() phase. The Portal that it creates must remain alive
until it's time for the SRF_RETURN_DONE(). What would the recommended
approach be to accomplish this efficiently (and without introducing a
major memory leak)?
Well, I think this is done the normal way. The function returning
values allocates them in its own context and does a RETURN NEXT. Once
it has returned a value it can free it, or reset the context if it
prefers. The caller is responsible for copying anything it needs to
keep (which isn't often needed).
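The pattern described above might be sketched like this. It is a fragment, with declarations and error handling omitted; `rowcxt`, `build_next_row`, and the other names are illustrative, not from any real PL implementation, and the `ALLOCSET_DEFAULT_SIZES` convenience macro is from newer PostgreSQL releases than the one discussed here.

```c
/* Allocate the returned value in a context you control, return it with
 * SRF_RETURN_NEXT, and reset the context on the next call.  The caller
 * must copy anything it wants to keep beyond that. */
if (SRF_IS_FIRSTCALL())
{
	funcctx = SRF_FIRSTCALL_INIT();

	/* private per-row workspace, child of the long-lived context */
	rowcxt = AllocSetContextCreate(funcctx->multi_call_memory_ctx,
								   "row workspace",
								   ALLOCSET_DEFAULT_SIZES);
	funcctx->user_fctx = rowcxt;
}
funcctx = SRF_PERCALL_SETUP();
rowcxt = (MemoryContext) funcctx->user_fctx;

/* drop everything that was allocated for the previous row */
MemoryContextReset(rowcxt);

oldcxt = MemoryContextSwitchTo(rowcxt);
result = build_next_row();		/* hypothetical per-row work */
MemoryContextSwitchTo(oldcxt);

SRF_RETURN_NEXT(funcctx, result);
```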
Have you read the executor/README ?
Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
tool for doing 5% of the work and then sitting around waiting for someone
else to do the other 95% so you can sue them.
Martijn van Oosterhout <kleptog@svana.org> writes:
I think you're running into a small limitation of set functions here.
If you look at nodeFunctionScan.c that handles this, you can see that
the code is written in such a way as to collect all the tuples first
before returning anything.
I don't think Thomas intended to go through nodeFunctionScan, so this
needn't apply to him.
Not sure why it does that,
plpgsql and similar languages will return a tuplestore anyway, so it has
to handle that case, and it was convenient to make all the cases look
alike for starters. Nobody's yet gone back to improve it for the case
of languages that return a tuple per call.
regards, tom lane
Tom Lane wrote:
plpgsql and similar languages will return a tuplestore anyway, so it has
to handle that case, and it was convenient to make all the cases look
alike for starters. Nobody's yet gone back to improve it for the case
of languages that return a tuple per call.
This would be hard to do in the plperl case, at least, and I would be
surprised if it weren't in most others too. So what plperl does is to
fetch the whole set on the first call and then fudges all the other
calls to get the next element from the result set. We save out the
intermediate tuple store on each call and restore it afterwards, so I
think recursion shouldn't be a difficulty.
cheers
andrew
Recursive calls work in PL/Java. No problem there. But the larger the
set, the more memory it consumes. Do I read your answers correctly if I
conclude this is a known limitation when SPI is used? I.e. there's no
way to stream one row at a time without ever building the full set?
Regards,
Thomas Hallgren
Thomas Hallgren wrote:
Recursive calls work in PL/Java. No problem there. But the larger the
set, the more memory it consumes. Do I read your answers correctly if I
conclude this is a known limitation when SPI is used? I.e. there's no
way to stream one row at a time without ever building the full set?
Hmm, are you using a tuplestore? The PL/php code for return_next looks
like this:
ZEND_FUNCTION(return_next)
{
    ... some stuff ...

    /* Use the per-query context so that the tuplestore survives */
    oldcxt = MemoryContextSwitchTo(rsi->econtext->ecxt_per_query_memory);

    /* Form the tuple */
    tup = plphp_srf_htup_from_zval(param, current_attinmeta, current_memcxt);

    /* First call?  Create the tuplestore. */
    if (!current_tuplestore)
        current_tuplestore = tuplestore_begin_heap(true, false, work_mem);

    /* Save the tuple and clean up */
    tuplestore_puttuple(current_tuplestore, tup);
    heap_freetuple(tup);

    MemoryContextSwitchTo(oldcxt);
}
--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Thomas Hallgren wrote:
Recursive calls work in PL/Java. No problem there. But the larger the
set, the more memory it consumes. Do I read your answers correctly if
I conclude this is a known limitation when SPI is used? I.e. there's
no way to stream one row at a time without ever building the full set?
plperl stashes the results in a tuplestore object, which spills to disk.
So memory use is not unbounded. Before 8.1 we had no return_next and no
intermediate tuplestore, so we had serious memory problems with
returning large sets.
As for SPI calls, we also had problems there but now we provide a cursor
interface that works much more nicely.
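The cursor interface mentioned above can be used roughly as follows. This is a hedged sketch, not the actual plperl code: the query text and batch size are invented, and the `read_only` argument to SPI_cursor_open was not present in all releases of this era.

```c
#include "postgres.h"
#include "executor/spi.h"

/* Fetch a large result in batches through an SPI cursor, rather than
 * materializing the whole set in memory at once. */
void
scan_big_result(void)
{
	SPIPlanPtr	plan;
	Portal		cur;

	SPI_connect();
	plan = SPI_prepare("SELECT * FROM big_table", 0, NULL);
	cur = SPI_cursor_open(NULL, plan, NULL, NULL, true);

	for (;;)
	{
		/* fetch the next batch of up to 1000 rows */
		SPI_cursor_fetch(cur, true, 1000);
		if (SPI_processed == 0)
			break;

		for (uint64 i = 0; i < SPI_processed; i++)
		{
			/* process SPI_tuptable->vals[i] here ... */
		}

		/* free this batch before fetching the next */
		SPI_freetuptable(SPI_tuptable);
	}

	SPI_cursor_close(cur);
	SPI_finish();
}
```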
cheers
andrew
Thomas Hallgren <thomas@tada.se> writes:
Recursive calls work in PL/Java. No problem there. But the larger the
set, the more memory it consumes. Do I read your answers correctly if I
conclude this is a known limitation when SPI is used? I.e. there's no
way to stream one row at a time without ever building the full set?
By no means. The point is that there are some callers of SRFs that are
going to materialize the result set, as well as some SRFs that are going
to hand back a materialized result set anyway. The interface can handle
a tuple-per-call but that's not the way everybody chooses to use it.
regards, tom lane
Tom Lane wrote:
Thomas Hallgren <thomas@tada.se> writes:
Recursive calls work in PL/Java. No problem there. But the larger the
set, the more memory it consumes. Do I read your answers correctly if I
conclude this is a known limitation when SPI is used? I.e. there's no
way to stream one row at a time without ever building the full set?

By no means. The point is that there are some callers of SRFs that are
going to materialize the result set, as well as some SRFs that are going
to hand back a materialized result set anyway. The interface can handle
a tuple-per-call but that's not the way everybody chooses to use it.
OK. I've managed to get rid of my last memory leak (I hope). I followed
Martijn's suggestion to create the returned tuple in my own context. Now
even the nastiest recursive chains using huge sets of data seem to
behave OK :-)
There's one thing that's still a bit fuzzy to me. If I don't use SPI,
the context that is current when my SRF function is called seems to be
reset between each call. I can palloc stuff in it as much as I like. I
can even create the tuple that I return using this context. No memory
leak. But if I, during the SRF_IS_FIRSTCALL phase, do an SPI_connect
(done when the 'multi_call_memory_ctx' is current), then the leak seems
to occur immediately. Will that connect somehow alter the durability of
the context that is current on each call to my SRF?
Regards,
Thomas Hallgren
On Wed, Mar 22, 2006 at 09:09:34PM +0100, Thomas Hallgren wrote:
There's one thing that's still a bit fuzzy to me. If I don't use SPI,
the context that is current when my SRF function is called seems to be
reset between each call. I can palloc stuff in it as much as I like. I
can even create the tuple that I return using this context. No memory
leak. But if I, during the SRF_IS_FIRSTCALL phase, do an SPI_connect
(done when the 'multi_call_memory_ctx' is current), then the leak seems
to occur immediately. Will that connect somehow alter the durability of
the context that is current on each call to my SRF?
Ok, I'm not sure if I understand the reasoning but I think it's like
this:
- When the results of an SRF are accumulated by
ExecMakeTableFunctionResult, that function is resetting your context
each time.
- When you call SPI_connect it creates a new context and switches to
it. It switches back on SPI_finish. SPI_finish switches to the context
active at SPI_connect, maybe this is not what you expect?
ExecMakeTableFunctionResult only resets the one context, the one
provided when your function starts; anything created in other contexts
is Somebody Else's Problem. So the question is: which context are you
allocating in?
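The two lifetimes distinguished above can be shown side by side. This is an illustrative fragment (`MyState` and the surrounding declarations are hypothetical), not code from any particular PL:

```c
/* 1. The context that is current when the SRF is entered is reset by
 *    the executor between calls, so per-row allocations here are safe: */
char *scratch = palloc(64);		/* freed for you before the next call */

/* 2. State that must survive across calls belongs in the
 *    multi_call_memory_ctx set up by SRF_FIRSTCALL_INIT(): */
MemoryContext oldcxt = MemoryContextSwitchTo(funcctx->multi_call_memory_ctx);
funcctx->user_fctx = palloc(sizeof(MyState));	/* MyState is hypothetical */
MemoryContextSwitchTo(oldcxt);
```

Anything allocated in a third context, such as one created by SPI_connect, is reset by neither mechanism, which is where unexpected growth (or unexpected disappearance) of allocations tends to come from.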
Hope this clarifies it,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
Thomas Hallgren <thomas@tada.se> writes:
But if I, during the SRF_IS_FIRSTCALL phase, do an SPI_connect
(done when the 'multi_call_memory_ctx' is current), then the leak seems
to occur immediately. Will that connect somehow alter the durability of
the context that is current on each call to my SRF?
Are you remembering to call SPI_finish when you're done? Maybe what you are
leaking is the SPI state information.
You could look at the MemoryContextStats printout for clues --- easiest
way is to deliberately run the backend out of memory, and after the "out
of memory" error occurs, look in the postmaster log.
regards, tom lane