Preventing tuple-table leakage in plpgsql

Started by Tom Laneover 12 years ago14 messages
#1Tom Lane
tgl@sss.pgh.pa.us

Bug #8279 exhibits an intra-transaction memory leak in a plpgsql
function that repeatedly traps an error. The cause of the leak is that
the SPITupleTable created during exec_stmt_execsql is never cleaned up.
(It will get cleaned up at function exit, but that's not soon enough in
this usage.)

The submitter proposes fixing this by inserting some more
SPI_freetuptable calls in exec_stmt_execsql, but that's not much of a
fix. The patch only covers the two ereports associated with "strict"
mode (and not, say, any errors thrown in functions called by
exec_stmt_execsql). Nor does it do anything for similar leakage
scenarios elsewhere. I count at least four other functions with the
same kind of oversight in plpgsql, and there are suspicious-looking
usages elsewhere as well.

One approach we could take is to insert PG_TRY blocks to ensure that
SPI_freetuptable is called on error exit from such functions. (plpython
seems to have adopted this solution already.) However, that tends to be
pretty messy notationally, and possibly could represent a noticeable
performance hit depending on what you assume about the speed of
sigsetjmp(). Moreover, fixing known trouble spots this way does nothing
to guard against similar errors elsewhere.

So I'm inclined to propose that SPI itself should offer some mechanism
for cleaning up tuple tables at subtransaction abort. We could just
have it automatically throw away tuple tables made in the current
subtransaction, or we could allow callers to exercise some control,
perhaps by calling a function that says "don't reclaim this tuple table
automatically". I'm not sure if there's any real use-case for such a
call though.

It's also not very clear to me if tuple tables should be thrown away
automatically at subtransaction commit. We could do that, or leave
things alone, or add some logic to emit warning bleats about unreleased
tuple tables (comparable to what is done for many other resource types).
If we change anything about the commit case, I think we run significant
risk of breaking third-party code that works now, so maybe it's best
to leave that alone.

It might also be worth debating whether to back-patch such a fix.
This issue has been there ever since plpgsql grew the ability to trap
errors, so the lack of previous reports might mean that it's not worth
taking risks to fix such leaks in back branches.

Comments, better ideas?

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Noah Misch
noah@leadboat.com
In reply to: Tom Lane (#1)
Re: Preventing tuple-table leakage in plpgsql

On Fri, Jul 05, 2013 at 02:47:06PM -0400, Tom Lane wrote:

Bug #8279 exhibits an intra-transaction memory leak in a plpgsql
function that repeatedly traps an error. The cause of the leak is that
the SPITupleTable created during exec_stmt_execsql is never cleaned up.
(It will get cleaned up at function exit, but that's not soon enough in
this usage.)

One approach we could take is to insert PG_TRY blocks to ensure that
SPI_freetuptable is called on error exit from such functions. (plpython
seems to have adopted this solution already.) However, that tends to be
pretty messy notationally, and possibly could represent a noticeable
performance hit depending on what you assume about the speed of
sigsetjmp(). Moreover, fixing known trouble spots this way does nothing
to guard against similar errors elsewhere.

I, too, find that strategy worth avoiding as long as practical.

So I'm inclined to propose that SPI itself should offer some mechanism
for cleaning up tuple tables at subtransaction abort. We could just
have it automatically throw away tuple tables made in the current
subtransaction, or we could allow callers to exercise some control,
perhaps by calling a function that says "don't reclaim this tuple table
automatically". I'm not sure if there's any real use-case for such a
call though.

I suppose that would be as simple as making spi_dest_startup() put the
tuptabcxt under CurTransactionContext? The function to prevent reclamation
would then just do a MemoryContextSetParent().

Is there good reason to believe that SPI tuptables are the main interesting
PL/pgSQL allocation to make a point of freeing promptly, error or no error? I
wonder if larger sections of pl_exec.c could run under CurTransactionContext.

It's also not very clear to me if tuple tables should be thrown away
automatically at subtransaction commit. We could do that, or leave
things alone, or add some logic to emit warning bleats about unreleased
tuple tables (comparable to what is done for many other resource types).
If we change anything about the commit case, I think we run significant
risk of breaking third-party code that works now, so maybe it's best
to leave that alone.

That's not clear to me, either. The safe thing would be to leave the default
unchanged but expose an API to override the tuptable parent context.
Initially, only PL/pgSQL would use it.

It might also be worth debating whether to back-patch such a fix.
This issue has been there ever since plpgsql grew the ability to trap
errors, so the lack of previous reports might mean that it's not worth
taking risks to fix such leaks in back branches.

I agree that could go either way; let's see what the patch looks like.

Thanks,
nm

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Chad Wagner
chad.wagner@gmail.com
In reply to: Noah Misch (#2)
Re: Preventing tuple-table leakage in plpgsql

It looks like to me exec_stmt_block creates a subtransaction if the block
has an exception handler by calling BeginInternalSubTransaction. Then
inside the PG_TRY it calls exec_stmts which runs the actual body of the
begin block. If an exception is thrown then I presume we are hitting the
PG_CATCH block where it calls RollbackAndReleaseCurrentSubTransaction.
Inside RollbackAndReleaseCurrentSubTransaction it calls AtEOSubXact_SPI
which frees the procCxt where the tuptabcxt is allocated.

Looking at it seems to suggest that the memory allocated under tuptabcxt
should be freed when we abort the subtransaction? Or did I miss something
here?

BTW, hacking pl_exec and moving the tubtabcxt under CurTransactionContext
does also seem to resolve the memory issue. Which suggests that somewhere
along the way AtEOSubXact_SPI is not called when the subtransaction is
aborted by the catch block, that or I got lost in the code.

On Fri, Jul 5, 2013 at 6:39 PM, Noah Misch <noah@leadboat.com> wrote:

Show quoted text

On Fri, Jul 05, 2013 at 02:47:06PM -0400, Tom Lane wrote:

Bug #8279 exhibits an intra-transaction memory leak in a plpgsql
function that repeatedly traps an error. The cause of the leak is that
the SPITupleTable created during exec_stmt_execsql is never cleaned up.
(It will get cleaned up at function exit, but that's not soon enough in
this usage.)

One approach we could take is to insert PG_TRY blocks to ensure that
SPI_freetuptable is called on error exit from such functions. (plpython
seems to have adopted this solution already.) However, that tends to be
pretty messy notationally, and possibly could represent a noticeable
performance hit depending on what you assume about the speed of
sigsetjmp(). Moreover, fixing known trouble spots this way does nothing
to guard against similar errors elsewhere.

I, too, find that strategy worth avoiding as long as practical.

So I'm inclined to propose that SPI itself should offer some mechanism
for cleaning up tuple tables at subtransaction abort. We could just
have it automatically throw away tuple tables made in the current
subtransaction, or we could allow callers to exercise some control,
perhaps by calling a function that says "don't reclaim this tuple table
automatically". I'm not sure if there's any real use-case for such a
call though.

I suppose that would be as simple as making spi_dest_startup() put the
tuptabcxt under CurTransactionContext? The function to prevent reclamation
would then just do a MemoryContextSetParent().

Is there good reason to believe that SPI tuptables are the main interesting
PL/pgSQL allocation to make a point of freeing promptly, error or no
error? I
wonder if larger sections of pl_exec.c could run under
CurTransactionContext.

It's also not very clear to me if tuple tables should be thrown away
automatically at subtransaction commit. We could do that, or leave
things alone, or add some logic to emit warning bleats about unreleased
tuple tables (comparable to what is done for many other resource types).
If we change anything about the commit case, I think we run significant
risk of breaking third-party code that works now, so maybe it's best
to leave that alone.

That's not clear to me, either. The safe thing would be to leave the
default
unchanged but expose an API to override the tuptable parent context.
Initially, only PL/pgSQL would use it.

It might also be worth debating whether to back-patch such a fix.
This issue has been there ever since plpgsql grew the ability to trap
errors, so the lack of previous reports might mean that it's not worth
taking risks to fix such leaks in back branches.

I agree that could go either way; let's see what the patch looks like.

Thanks,
nm

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com

#4Chad Wagner
chad.wagner@gmail.com
In reply to: Chad Wagner (#3)
Re: Preventing tuple-table leakage in plpgsql

It looks like to me when AtEOSubXact_SPI is called the
_SPI_current->connectSubId is always 1 (since it is only set when
SPI_connect is called, which is only once for plpgsql), but the
CurrentSubTransactionId is incremented each time a subtransaction is
started.

As a result, the memory for procCxt is only freed when I presume the
TopTransaction is aborted or committed.

Should SPI_connect be called again after the subtransaction is created?
And SPI_finish before the subtransaction is committed or aborted?

On Thu, Jul 11, 2013 at 8:46 PM, Chad Wagner <chad.wagner@gmail.com> wrote:

Show quoted text

It looks like to me exec_stmt_block creates a subtransaction if the block
has an exception handler by calling BeginInternalSubTransaction. Then
inside the PG_TRY it calls exec_stmts which runs the actual body of the
begin block. If an exception is thrown then I presume we are hitting the
PG_CATCH block where it calls RollbackAndReleaseCurrentSubTransaction.
Inside RollbackAndReleaseCurrentSubTransaction it calls AtEOSubXact_SPI
which frees the procCxt where the tuptabcxt is allocated.

Looking at it seems to suggest that the memory allocated under tuptabcxt
should be freed when we abort the subtransaction? Or did I miss something
here?

BTW, hacking pl_exec and moving the tubtabcxt under CurTransactionContext
does also seem to resolve the memory issue. Which suggests that somewhere
along the way AtEOSubXact_SPI is not called when the subtransaction is
aborted by the catch block, that or I got lost in the code.

On Fri, Jul 5, 2013 at 6:39 PM, Noah Misch <noah@leadboat.com> wrote:

On Fri, Jul 05, 2013 at 02:47:06PM -0400, Tom Lane wrote:

Bug #8279 exhibits an intra-transaction memory leak in a plpgsql
function that repeatedly traps an error. The cause of the leak is that
the SPITupleTable created during exec_stmt_execsql is never cleaned up.
(It will get cleaned up at function exit, but that's not soon enough in
this usage.)

One approach we could take is to insert PG_TRY blocks to ensure that
SPI_freetuptable is called on error exit from such functions. (plpython
seems to have adopted this solution already.) However, that tends to be
pretty messy notationally, and possibly could represent a noticeable
performance hit depending on what you assume about the speed of
sigsetjmp(). Moreover, fixing known trouble spots this way does nothing
to guard against similar errors elsewhere.

I, too, find that strategy worth avoiding as long as practical.

So I'm inclined to propose that SPI itself should offer some mechanism
for cleaning up tuple tables at subtransaction abort. We could just
have it automatically throw away tuple tables made in the current
subtransaction, or we could allow callers to exercise some control,
perhaps by calling a function that says "don't reclaim this tuple table
automatically". I'm not sure if there's any real use-case for such a
call though.

I suppose that would be as simple as making spi_dest_startup() put the
tuptabcxt under CurTransactionContext? The function to prevent
reclamation
would then just do a MemoryContextSetParent().

Is there good reason to believe that SPI tuptables are the main
interesting
PL/pgSQL allocation to make a point of freeing promptly, error or no
error? I
wonder if larger sections of pl_exec.c could run under
CurTransactionContext.

It's also not very clear to me if tuple tables should be thrown away
automatically at subtransaction commit. We could do that, or leave
things alone, or add some logic to emit warning bleats about unreleased
tuple tables (comparable to what is done for many other resource types).
If we change anything about the commit case, I think we run significant
risk of breaking third-party code that works now, so maybe it's best
to leave that alone.

That's not clear to me, either. The safe thing would be to leave the
default
unchanged but expose an API to override the tuptable parent context.
Initially, only PL/pgSQL would use it.

It might also be worth debating whether to back-patch such a fix.
This issue has been there ever since plpgsql grew the ability to trap
errors, so the lack of previous reports might mean that it's not worth
taking risks to fix such leaks in back branches.

I agree that could go either way; let's see what the patch looks like.

Thanks,
nm

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com

#5Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noah Misch (#2)
1 attachment(s)
Re: Preventing tuple-table leakage in plpgsql

Noah Misch <noah@leadboat.com> writes:

On Fri, Jul 05, 2013 at 02:47:06PM -0400, Tom Lane wrote:

So I'm inclined to propose that SPI itself should offer some mechanism
for cleaning up tuple tables at subtransaction abort. We could just
have it automatically throw away tuple tables made in the current
subtransaction, or we could allow callers to exercise some control,
perhaps by calling a function that says "don't reclaim this tuple table
automatically". I'm not sure if there's any real use-case for such a
call though.

I suppose that would be as simple as making spi_dest_startup() put the
tuptabcxt under CurTransactionContext? The function to prevent reclamation
would then just do a MemoryContextSetParent().

I experimented with this, and found out that it's not quite that simple.
In a SPI procedure that hasn't created a subtransaction (eg, a plpgsql
function without an exception block), if we attach tuple tables to the
outer transaction's CurTransactionContext then they fail to go away
during SPI_finish(), creating leaks in code that was previously correct.

However, we can use your idea when running inside a subtransaction,
while still attaching the tuple table to the procedure's own procCxt
when no subtransaction is involved. The attached draft patch does it
that way, and successfully resolves the complained-of leakage case.

I like this better than what I originally had in mind, because it
produces no change in semantics in the case where a SPI procedure
doesn't create any subtransactions, which I imagine is at least 99.44%
of third-party SPI-using code.

Also, because the tuple tables don't go away until
CleanupSubTransaction(), code that explicitly frees them will still work
as long as it does that before rolling back the subxact. Thus, the
patch's changes to remove SPI_freetuptable() calls in
plpy_cursorobject.c are not actually necessary for correctness, it's
just that we no longer need them. Ditto for the change in
AtEOSubXact_SPI. Unfortunately, the change in pl_exec.c *is* necessary
to avoid a coredump, because that call was being made after rolling back
the subxact.

All in all, the risk of breaking anything outside core code seems very
small, and I'm inclined to think that we don't need to provide an API
function to reparent a tuple table. But having said that, I'm also
inclined to not back-patch this further than 9.3, just in case.

The patch still requires documentation changes, and there's also one
more error-cleanup SPI_freetuptable() call that ought to go away, in
PLy_spi_execute_fetch_result. But that one needs the fix posited in
/messages/by-id/21017.1374273434@sss.pgh.pa.us
first.

Comments?

regards, tom lane

Attachments:

tuple-table-leak-fix-1.patchtext/x-patch; charset=us-asciiDownload
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index f30b2cd93330d72e0a412d6517aec08c96cc753e..3f4d1e625d738ad0c08d0245a0169451d29a631d 100644
*** a/src/backend/executor/spi.c
--- b/src/backend/executor/spi.c
*************** AtEOSubXact_SPI(bool isCommit, SubTransa
*** 284,291 ****
  	{
  		/* free Executor memory the same as _SPI_end_call would do */
  		MemoryContextResetAndDeleteChildren(_SPI_current->execCxt);
! 		/* throw away any partially created tuple-table */
! 		SPI_freetuptable(_SPI_current->tuptable);
  		_SPI_current->tuptable = NULL;
  	}
  }
--- 284,291 ----
  	{
  		/* free Executor memory the same as _SPI_end_call would do */
  		MemoryContextResetAndDeleteChildren(_SPI_current->execCxt);
! 		/* forget any partially created tuple-table */
! 		/* (we don't need to free it, subxact cleanup will do so) */
  		_SPI_current->tuptable = NULL;
  	}
  }
*************** void
*** 1641,1648 ****
  spi_dest_startup(DestReceiver *self, int operation, TupleDesc typeinfo)
  {
  	SPITupleTable *tuptable;
! 	MemoryContext oldcxt;
  	MemoryContext tuptabcxt;
  
  	/*
  	 * When called by Executor _SPI_curid expected to be equal to
--- 1641,1649 ----
  spi_dest_startup(DestReceiver *self, int operation, TupleDesc typeinfo)
  {
  	SPITupleTable *tuptable;
! 	MemoryContext parentcxt;
  	MemoryContext tuptabcxt;
+ 	MemoryContext oldcxt;
  
  	/*
  	 * When called by Executor _SPI_curid expected to be equal to
*************** spi_dest_startup(DestReceiver *self, int
*** 1656,1669 ****
  	if (_SPI_current->tuptable != NULL)
  		elog(ERROR, "improper call to spi_dest_startup");
  
! 	oldcxt = _SPI_procmem();	/* switch to procedure memory context */
  
! 	tuptabcxt = AllocSetContextCreate(CurrentMemoryContext,
  									  "SPI TupTable",
  									  ALLOCSET_DEFAULT_MINSIZE,
  									  ALLOCSET_DEFAULT_INITSIZE,
  									  ALLOCSET_DEFAULT_MAXSIZE);
! 	MemoryContextSwitchTo(tuptabcxt);
  
  	_SPI_current->tuptable = tuptable = (SPITupleTable *)
  		palloc(sizeof(SPITupleTable));
--- 1657,1681 ----
  	if (_SPI_current->tuptable != NULL)
  		elog(ERROR, "improper call to spi_dest_startup");
  
! 	/*
! 	 * If we're in the same subtransaction our SPI procedure was called in,
! 	 * attach the tuple table context to the procedure's procCxt.  This
! 	 * ensures it'll go away at SPI_finish().  If we're in a subtransaction
! 	 * established within the procedure, attach the tuple table to the
! 	 * subtransaction's CurTransactionContext, so that it'll go away
! 	 * immediately if the subtransaction fails.
! 	 */
! 	if (_SPI_current->connectSubid == GetCurrentSubTransactionId())
! 		parentcxt = _SPI_current->procCxt;
! 	else
! 		parentcxt = CurTransactionContext;
  
! 	tuptabcxt = AllocSetContextCreate(parentcxt,
  									  "SPI TupTable",
  									  ALLOCSET_DEFAULT_MINSIZE,
  									  ALLOCSET_DEFAULT_INITSIZE,
  									  ALLOCSET_DEFAULT_MAXSIZE);
! 	oldcxt = MemoryContextSwitchTo(tuptabcxt);
  
  	_SPI_current->tuptable = tuptable = (SPITupleTable *)
  		palloc(sizeof(SPITupleTable));
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index 5b142e3bee6ccf929f3d4ec8ef3bf6c46467dc90..aa6b0af2e08ba59358ec8bfa7182bae5cef9cb8a 100644
*** a/src/pl/plpgsql/src/pl_exec.c
--- b/src/pl/plpgsql/src/pl_exec.c
*************** exec_stmt_block(PLpgSQL_execstate *estat
*** 1202,1208 ****
  			 */
  			SPI_restore_connection();
  
! 			/* Must clean up the econtext too */
  			exec_eval_cleanup(estate);
  
  			/* Look for a matching exception handler */
--- 1202,1214 ----
  			 */
  			SPI_restore_connection();
  
! 			/*
! 			 * Must clean up the econtext too.	However, any tuple table made
! 			 * in the subxact will have been thrown away by SPI during subxact
! 			 * abort, so we don't need to (and mustn't try to) free the
! 			 * eval_tuptable.
! 			 */
! 			estate->eval_tuptable = NULL;
  			exec_eval_cleanup(estate);
  
  			/* Look for a matching exception handler */
diff --git a/src/pl/plpython/plpy_cursorobject.c b/src/pl/plpython/plpy_cursorobject.c
index 910e63b19954cae697043d1641365de9b480cdc0..2c458d35fdb1a19fad1a46670487d6468f63384b 100644
*** a/src/pl/plpython/plpy_cursorobject.c
--- b/src/pl/plpython/plpy_cursorobject.c
*************** PLy_cursor_iternext(PyObject *self)
*** 377,384 ****
  	}
  	PG_CATCH();
  	{
- 		SPI_freetuptable(SPI_tuptable);
- 
  		PLy_spi_subtransaction_abort(oldcontext, oldowner);
  		return NULL;
  	}
--- 377,382 ----
*************** PLy_cursor_fetch(PyObject *self, PyObjec
*** 461,468 ****
  	}
  	PG_CATCH();
  	{
- 		SPI_freetuptable(SPI_tuptable);
- 
  		PLy_spi_subtransaction_abort(oldcontext, oldowner);
  		return NULL;
  	}
--- 459,464 ----
#6Noah Misch
noah@leadboat.com
In reply to: Chad Wagner (#4)
Re: Preventing tuple-table leakage in plpgsql

On Thu, Jul 11, 2013 at 09:14:38PM -0400, Chad Wagner wrote:

It looks like to me when AtEOSubXact_SPI is called the
_SPI_current->connectSubId is always 1 (since it is only set when
SPI_connect is called, which is only once for plpgsql), but the
CurrentSubTransactionId is incremented each time a subtransaction is
started.

Right. AtEOSubXact_SPI() cleans up any SPI connections originating in the
ending subtransaction. It leaves alone connections from higher subtransaction
levels; SPI has no general expectation that those have lost relevance.

As a result, the memory for procCxt is only freed when I presume the
TopTransaction is aborted or committed.

In your code from bug #8279, I expect it to be freed when the DO block exits.
The backend might not actually shrink then, but repeated calls to a similar DO
block within the same transaction should not cause successive increases in the
process's memory footprint.

Should SPI_connect be called again after the subtransaction is created? And
SPI_finish before the subtransaction is committed or aborted?

Hmm. An SPI_push()+SPI_connect() every time PL/pgSQL starts a subtransaction
would be another way to fix it, yes.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noah Misch (#6)
Re: Preventing tuple-table leakage in plpgsql

Noah Misch <noah@leadboat.com> writes:

On Thu, Jul 11, 2013 at 09:14:38PM -0400, Chad Wagner wrote:

Should SPI_connect be called again after the subtransaction is created? And
SPI_finish before the subtransaction is committed or aborted?

Hmm. An SPI_push()+SPI_connect() every time PL/pgSQL starts a subtransaction
would be another way to fix it, yes.

That sounds like a dangerous idea to me. The procedure would then be
working actively with queries from two different SPI levels, which I'm
pretty sure would cause issues. It's possible that plpgsql's SPI access
is sufficiently lexically-local that statements within the BEGIN block
couldn't use any SPI resources created by statements outside it nor vice
versa. But then again maybe not, and in any case we couldn't imagine
that that would be a workable restriction for non-plpgsql scenarios.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Noah Misch
noah@leadboat.com
In reply to: Tom Lane (#5)
Re: Preventing tuple-table leakage in plpgsql

On Fri, Jul 19, 2013 at 07:34:14PM -0400, Tom Lane wrote:

Noah Misch <noah@leadboat.com> writes:

On Fri, Jul 05, 2013 at 02:47:06PM -0400, Tom Lane wrote:

So I'm inclined to propose that SPI itself should offer some mechanism
for cleaning up tuple tables at subtransaction abort. We could just
have it automatically throw away tuple tables made in the current
subtransaction, or we could allow callers to exercise some control,
perhaps by calling a function that says "don't reclaim this tuple table
automatically". I'm not sure if there's any real use-case for such a
call though.

I suppose that would be as simple as making spi_dest_startup() put the
tuptabcxt under CurTransactionContext? The function to prevent reclamation
would then just do a MemoryContextSetParent().

I experimented with this, and found out that it's not quite that simple.
In a SPI procedure that hasn't created a subtransaction (eg, a plpgsql
function without an exception block), if we attach tuple tables to the
outer transaction's CurTransactionContext then they fail to go away
during SPI_finish(), creating leaks in code that was previously correct.

However, we can use your idea when running inside a subtransaction,
while still attaching the tuple table to the procedure's own procCxt
when no subtransaction is involved. The attached draft patch does it
that way, and successfully resolves the complained-of leakage case.

I like this better than what I originally had in mind, because it
produces no change in semantics in the case where a SPI procedure
doesn't create any subtransactions, which I imagine is at least 99.44%
of third-party SPI-using code.

Reasonable enough. Code that does use subtransactions will need to be more
careful than before to manually free tuple tables in the non-error case.
Failure to do so has been creating a leak that lasts until SPI_finish(), but
it will now be able to cause a transaction-lifespan leak.

patch's changes to remove SPI_freetuptable() calls in
plpy_cursorobject.c are not actually necessary for correctness, it's
just that we no longer need them.

If PLy_spi_subtransaction_commit() were to throw an error (granted, unlikely),
would we not reference freed memory at those code sites as they stand today?

Unfortunately, the change in pl_exec.c *is* necessary
to avoid a coredump, because that call was being made after rolling back
the subxact.

Brief search for similar patterns in external PLs:

plr - no subtransaction use
plv8 - no SPI_freetuptable()
plphp - uses both, but usage looks compatible
pljava - calls "SPI_freetuptable(SPI_tuptable)", but never a tuptable pointer
it stored away. Should be compatible, then.

All in all, the risk of breaking anything outside core code seems very
small, and I'm inclined to think that we don't need to provide an API
function to reparent a tuple table. But having said that, I'm also
inclined to not back-patch this further than 9.3, just in case.

I wouldn't be confident in back-patching further than that. It's not hard to
imagine a non-core PL needing a compensating change like the one you made to
PL/pgSQL.

Thanks,
nm

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noah Misch (#8)
Re: Preventing tuple-table leakage in plpgsql

Noah Misch <noah@leadboat.com> writes:

On Fri, Jul 19, 2013 at 07:34:14PM -0400, Tom Lane wrote:

However, we can use your idea when running inside a subtransaction,
while still attaching the tuple table to the procedure's own procCxt
when no subtransaction is involved. The attached draft patch does it
that way, and successfully resolves the complained-of leakage case.

I like this better than what I originally had in mind, because it
produces no change in semantics in the case where a SPI procedure
doesn't create any subtransactions, which I imagine is at least 99.44%
of third-party SPI-using code.

Reasonable enough. Code that does use subtransactions will need to be more
careful than before to manually free tuple tables in the non-error case.
Failure to do so has been creating a leak that lasts until SPI_finish(), but
it will now be able to cause a transaction-lifespan leak.

Hmm ... good point. The other plan I'd been considering was to add
explicit tracking inside spi.c of all tuple tables created within the
current procedure, and then have AtEOSubXact_SPI flush any that were
created inside a failed subxact. The tables would still be children of
the procCxt and thus could not be leaked past SPI_finish. When you
suggested attaching to subtransaction contexts I thought that would let
us get away without this additional bookkeeping logic, but maybe we
should bite the bullet and add the extra logic. A change that's meant
to remove leak risks really shouldn't be introducing other, new leak
risks. (An additional advantage is we could detect attempts to free
the same tuptable more than once, which would be a good thing ...)

patch's changes to remove SPI_freetuptable() calls in
plpy_cursorobject.c are not actually necessary for correctness, it's
just that we no longer need them.

If PLy_spi_subtransaction_commit() were to throw an error (granted, unlikely),
would we not reference freed memory at those code sites as they stand today?

Hm, possibly, depending on just when the error was thrown.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Noah Misch
noah@leadboat.com
In reply to: Tom Lane (#9)
Re: Preventing tuple-table leakage in plpgsql

On Sun, Jul 21, 2013 at 12:40:38PM -0400, Tom Lane wrote:

Noah Misch <noah@leadboat.com> writes:

Reasonable enough. Code that does use subtransactions will need to be more
careful than before to manually free tuple tables in the non-error case.
Failure to do so has been creating a leak that lasts until SPI_finish(), but
it will now be able to cause a transaction-lifespan leak.

Hmm ... good point. The other plan I'd been considering was to add
explicit tracking inside spi.c of all tuple tables created within the
current procedure, and then have AtEOSubXact_SPI flush any that were
created inside a failed subxact. The tables would still be children of
the procCxt and thus could not be leaked past SPI_finish. When you
suggested attaching to subtransaction contexts I thought that would let
us get away without this additional bookkeeping logic, but maybe we
should bite the bullet and add the extra logic.

Is there reason to believe we wouldn't eventually find a half dozen other
allocations calling for similar bespoke treatment? Does something make tuple
tables special among memory allocations, or are they just the garden-variety
allocation that happens to bother the test case at hand?

A change that's meant
to remove leak risks really shouldn't be introducing other, new leak
risks.

Yes; that gives one pause.

Thanks,
nm

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Noah Misch (#10)
Re: Preventing tuple-table leakage in plpgsql

Noah Misch <noah@leadboat.com> writes:

On Sun, Jul 21, 2013 at 12:40:38PM -0400, Tom Lane wrote:

Hmm ... good point. The other plan I'd been considering was to add
explicit tracking inside spi.c of all tuple tables created within the
current procedure, and then have AtEOSubXact_SPI flush any that were
created inside a failed subxact.

Is there reason to believe we wouldn't eventually find a half dozen other
allocations calling for similar bespoke treatment? Does something make tuple
tables special among memory allocations, or are they just the garden-variety
allocation that happens to bother the test case at hand?

It's hard to speculate about the memory management habits of third-party
SPI-using code. But in plpgsql, the convention is that random bits of
memory should be allocated in a short-term context separate from the SPI
procCxt --- typically, the estate->eval_econtext expression context,
which exec_stmt_block already takes care to clean up when catching an
exception. So the problem is that that doesn't work for tuple tables,
which have procCxt lifespan. The fact that they tend to be big (at
least 8K apiece) compounds the issue.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Noah Misch
noah@leadboat.com
In reply to: Tom Lane (#11)
Re: Preventing tuple-table leakage in plpgsql

On Mon, Jul 22, 2013 at 10:02:30PM -0400, Tom Lane wrote:

Noah Misch <noah@leadboat.com> writes:

On Sun, Jul 21, 2013 at 12:40:38PM -0400, Tom Lane wrote:

Hmm ... good point. The other plan I'd been considering was to add
explicit tracking inside spi.c of all tuple tables created within the
current procedure, and then have AtEOSubXact_SPI flush any that were
created inside a failed subxact.

Is there reason to believe we wouldn't eventually find a half dozen other
allocations calling for similar bespoke treatment? Does something make tuple
tables special among memory allocations, or are they just the garden-variety
allocation that happens to bother the test case at hand?

It's hard to speculate about the memory management habits of third-party
SPI-using code. But in plpgsql, the convention is that random bits of
memory should be allocated in a short-term context separate from the SPI
procCxt --- typically, the estate->eval_econtext expression context,
which exec_stmt_block already takes care to clean up when catching an
exception. So the problem is that that doesn't work for tuple tables,
which have procCxt lifespan. The fact that they tend to be big (at
least 8K apiece) compounds the issue.

Reasonable to treat them specially, per your plan, then.

--
Noah Misch
EnterpriseDB http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#9)
1 attachment(s)
Re: Preventing tuple-table leakage in plpgsql

I wrote:

Hmm ... good point. The other plan I'd been considering was to add
explicit tracking inside spi.c of all tuple tables created within the
current procedure, and then have AtEOSubXact_SPI flush any that were
created inside a failed subxact. The tables would still be children of
the procCxt and thus could not be leaked past SPI_finish. When you
suggested attaching to subtransaction contexts I thought that would let
us get away without this additional bookkeeping logic, but maybe we
should bite the bullet and add the extra logic. A change that's meant
to remove leak risks really shouldn't be introducing other, new leak
risks. (An additional advantage is we could detect attempts to free
the same tuptable more than once, which would be a good thing ...)

Here's a draft cut at that. I've checked that this fixes the reported
memory leak. This uses ilist.h, so it couldn't easily be backported
to before 9.3, but we weren't going to do that anyway.

Worth noting is that SPI_freetuptable() is now much more picky about what
it's being passed: it won't free anything that is not a tuptable of the
currently connected SPI procedure. This doesn't appear to be a problem
for anything in core or contrib, but I wonder whether anyone thinks we
need to relax that? If so, we could allow it to search down the SPI
context stack, but I'm not sure I see a use-case for allowing an inner
SPI procedure to free a tuple table made by an outer one. In general,
I think this version of SPI_freetuptable() is a lot safer than what
we had before, and possibly likely to find bugs in calling code.

Another point worth making is that this version of the patch deletes the
tuple tables during AtEOSubXact_SPI(), earlier in cleanup than would
happen with the prior version. That increases the risk that external
code might try to delete an already-deleted tuple table, if it tries
to call SPI_freetuptable() during subxact cleanup. The new code won't
crash, although come to think of it it will probably throw an error
because you're not connected anymore. (Maybe this is a reason to not
insist on being connected, but just silently search whatever the top
stack context is?)

This still lacks doc changes, too.

regards, tom lane

Attachments:

tuple-table-leak-fix-2.patchtext/x-patch; charset=us-asciiDownload
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index f30b2cd93330d72e0a412d6517aec08c96cc753e..f179922dd4bd4eb36212ebd91413ba28de44b2fc 100644
*** a/src/backend/executor/spi.c
--- b/src/backend/executor/spi.c
*************** SPI_connect(void)
*** 126,131 ****
--- 126,132 ----
  	_SPI_current->processed = 0;
  	_SPI_current->lastoid = InvalidOid;
  	_SPI_current->tuptable = NULL;
+ 	slist_init(&_SPI_current->tuptables);
  	_SPI_current->procCxt = NULL;		/* in case we fail to create 'em */
  	_SPI_current->execCxt = NULL;
  	_SPI_current->connectSubid = GetCurrentSubTransactionId();
*************** SPI_finish(void)
*** 166,172 ****
  	/* Restore memory context as it was before procedure call */
  	MemoryContextSwitchTo(_SPI_current->savedcxt);
  
! 	/* Release memory used in procedure call */
  	MemoryContextDelete(_SPI_current->execCxt);
  	_SPI_current->execCxt = NULL;
  	MemoryContextDelete(_SPI_current->procCxt);
--- 167,173 ----
  	/* Restore memory context as it was before procedure call */
  	MemoryContextSwitchTo(_SPI_current->savedcxt);
  
! 	/* Release memory used in procedure call (including tuptables) */
  	MemoryContextDelete(_SPI_current->execCxt);
  	_SPI_current->execCxt = NULL;
  	MemoryContextDelete(_SPI_current->procCxt);
*************** AtEOSubXact_SPI(bool isCommit, SubTransa
*** 282,292 ****
  	 */
  	if (_SPI_current && !isCommit)
  	{
  		/* free Executor memory the same as _SPI_end_call would do */
  		MemoryContextResetAndDeleteChildren(_SPI_current->execCxt);
! 		/* throw away any partially created tuple-table */
! 		SPI_freetuptable(_SPI_current->tuptable);
! 		_SPI_current->tuptable = NULL;
  	}
  }
  
--- 283,316 ----
  	 */
  	if (_SPI_current && !isCommit)
  	{
+ 		slist_mutable_iter siter;
+ 
  		/* free Executor memory the same as _SPI_end_call would do */
  		MemoryContextResetAndDeleteChildren(_SPI_current->execCxt);
! 
! 		/* throw away any tuple tables created within current subxact */
! 		slist_foreach_modify(siter, &_SPI_current->tuptables)
! 		{
! 			SPITupleTable *tuptable;
! 
! 			tuptable = slist_container(SPITupleTable, next, siter.cur);
! 			if (tuptable->subid >= mySubid)
! 			{
! 				/*
! 				 * If we used SPI_freetuptable() here, its internal search of
! 				 * the tuptables list would make this operation O(N^2).
! 				 * Instead, just free the tuptable manually.
! 				 */
! 				slist_delete_current(&siter);
! 				if (tuptable == _SPI_current->tuptable)
! 					_SPI_current->tuptable = NULL;
! 				if (tuptable == SPI_tuptable)
! 					SPI_tuptable = NULL;
! 				MemoryContextDelete(tuptable->tuptabcxt);
! 			}
! 		}
! 		/* in particular we should have gotten rid of any in-progress table */
! 		Assert(_SPI_current->tuptable == NULL);
  	}
  }
  
*************** SPI_freetuple(HeapTuple tuple)
*** 1021,1028 ****
  void
  SPI_freetuptable(SPITupleTable *tuptable)
  {
! 	if (tuptable != NULL)
! 		MemoryContextDelete(tuptable->tuptabcxt);
  }
  
  
--- 1045,1089 ----
  void
  SPI_freetuptable(SPITupleTable *tuptable)
  {
! 	slist_mutable_iter siter;
! 	bool		found = false;
! 
! 	/* ignore call if NULL pointer */
! 	if (tuptable == NULL)
! 		return;
! 
! 	/* must be connected */
! 	if (_SPI_curid + 1 != _SPI_connected)
! 		elog(ERROR, "improper call to SPI_freetuptable");
! 	if (_SPI_current != &(_SPI_stack[_SPI_curid + 1]))
! 		elog(ERROR, "SPI stack corrupted");
! 
! 	/* ensure tuptable is active, then remove it from list */
! 	slist_foreach_modify(siter, &_SPI_current->tuptables)
! 	{
! 		SPITupleTable *tt;
! 
! 		tt = slist_container(SPITupleTable, next, siter.cur);
! 		if (tt == tuptable)
! 		{
! 			slist_delete_current(&siter);
! 			found = true;
! 			break;
! 		}
! 	}
! 	if (!found)
! 	{
! 		elog(WARNING, "attempt to delete invalid SPITupleTable %p", tuptable);
! 		return;
! 	}
! 
! 	/* for safety, reset global variables that might point at tuptable */
! 	if (tuptable == _SPI_current->tuptable)
! 		_SPI_current->tuptable = NULL;
! 	if (tuptable == SPI_tuptable)
! 		SPI_tuptable = NULL;
! 	/* release all memory belonging to tuptable */
! 	MemoryContextDelete(tuptable->tuptabcxt);
  }
  
  
*************** spi_dest_startup(DestReceiver *self, int
*** 1656,1661 ****
--- 1717,1724 ----
  	if (_SPI_current->tuptable != NULL)
  		elog(ERROR, "improper call to spi_dest_startup");
  
+ 	/* We create the tuple table context as a child of procCxt */
+ 
  	oldcxt = _SPI_procmem();	/* switch to procedure memory context */
  
  	tuptabcxt = AllocSetContextCreate(CurrentMemoryContext,
*************** spi_dest_startup(DestReceiver *self, int
*** 1666,1673 ****
  	MemoryContextSwitchTo(tuptabcxt);
  
  	_SPI_current->tuptable = tuptable = (SPITupleTable *)
! 		palloc(sizeof(SPITupleTable));
  	tuptable->tuptabcxt = tuptabcxt;
  	tuptable->alloced = tuptable->free = 128;
  	tuptable->vals = (HeapTuple *) palloc(tuptable->alloced * sizeof(HeapTuple));
  	tuptable->tupdesc = CreateTupleDescCopy(typeinfo);
--- 1729,1746 ----
  	MemoryContextSwitchTo(tuptabcxt);
  
  	_SPI_current->tuptable = tuptable = (SPITupleTable *)
! 		palloc0(sizeof(SPITupleTable));
  	tuptable->tuptabcxt = tuptabcxt;
+ 	tuptable->subid = GetCurrentSubTransactionId();
+ 
+ 	/*
+ 	 * The tuptable is now valid enough to be freed by AtEOSubXact_SPI, so put
+ 	 * it onto the SPI context's tuptables list.  This will ensure it's not
+ 	 * leaked even in the unlikely event the following few lines fail.
+ 	 */
+ 	slist_push_head(&_SPI_current->tuptables, &tuptable->next);
+ 
+ 	/* set up initial allocations */
  	tuptable->alloced = tuptable->free = 128;
  	tuptable->vals = (HeapTuple *) palloc(tuptable->alloced * sizeof(HeapTuple));
  	tuptable->tupdesc = CreateTupleDescCopy(typeinfo);
diff --git a/src/include/executor/spi.h b/src/include/executor/spi.h
index d4f1272cd81ae13eb52a16327357bfdf15ad7621..81310e377f6b769c5aa306cf9aab932813dbeef8 100644
*** a/src/include/executor/spi.h
--- b/src/include/executor/spi.h
***************
*** 13,18 ****
--- 13,19 ----
  #ifndef SPI_H
  #define SPI_H
  
+ #include "lib/ilist.h"
  #include "nodes/parsenodes.h"
  #include "utils/portal.h"
  
*************** typedef struct SPITupleTable
*** 24,29 ****
--- 25,32 ----
  	uint32		free;			/* # of free vals */
  	TupleDesc	tupdesc;		/* tuple descriptor */
  	HeapTuple  *vals;			/* tuples */
+ 	slist_node	next;			/* link for internal bookkeeping */
+ 	SubTransactionId subid;		/* subxact in which tuptable was created */
  } SPITupleTable;
  
  /* Plans are opaque structs for standard users of SPI */
diff --git a/src/include/executor/spi_priv.h b/src/include/executor/spi_priv.h
index ef7903abd09a94906d48bfcbf3ea223aacdd9423..7d4c9e963933401bf278363bf51a9db63c78086f 100644
*** a/src/include/executor/spi_priv.h
--- b/src/include/executor/spi_priv.h
*************** typedef struct
*** 23,30 ****
  	/* current results */
  	uint32		processed;		/* by Executor */
  	Oid			lastoid;
! 	SPITupleTable *tuptable;
  
  	MemoryContext procCxt;		/* procedure context */
  	MemoryContext execCxt;		/* executor context */
  	MemoryContext savedcxt;		/* context of SPI_connect's caller */
--- 23,32 ----
  	/* current results */
  	uint32		processed;		/* by Executor */
  	Oid			lastoid;
! 	SPITupleTable *tuptable;	/* tuptable currently being built */
  
+ 	/* resources of this execution context */
+ 	slist_head	tuptables;		/* list of all live SPITupleTables */
  	MemoryContext procCxt;		/* procedure context */
  	MemoryContext execCxt;		/* executor context */
  	MemoryContext savedcxt;		/* context of SPI_connect's caller */
diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index 5b142e3bee6ccf929f3d4ec8ef3bf6c46467dc90..aa6b0af2e08ba59358ec8bfa7182bae5cef9cb8a 100644
*** a/src/pl/plpgsql/src/pl_exec.c
--- b/src/pl/plpgsql/src/pl_exec.c
*************** exec_stmt_block(PLpgSQL_execstate *estat
*** 1202,1208 ****
  			 */
  			SPI_restore_connection();
  
! 			/* Must clean up the econtext too */
  			exec_eval_cleanup(estate);
  
  			/* Look for a matching exception handler */
--- 1202,1214 ----
  			 */
  			SPI_restore_connection();
  
! 			/*
! 			 * Must clean up the econtext too.	However, any tuple table made
! 			 * in the subxact will have been thrown away by SPI during subxact
! 			 * abort, so we don't need to (and mustn't try to) free the
! 			 * eval_tuptable.
! 			 */
! 			estate->eval_tuptable = NULL;
  			exec_eval_cleanup(estate);
  
  			/* Look for a matching exception handler */
diff --git a/src/pl/plpython/plpy_cursorobject.c b/src/pl/plpython/plpy_cursorobject.c
index 910e63b19954cae697043d1641365de9b480cdc0..2c458d35fdb1a19fad1a46670487d6468f63384b 100644
*** a/src/pl/plpython/plpy_cursorobject.c
--- b/src/pl/plpython/plpy_cursorobject.c
*************** PLy_cursor_iternext(PyObject *self)
*** 377,384 ****
  	}
  	PG_CATCH();
  	{
- 		SPI_freetuptable(SPI_tuptable);
- 
  		PLy_spi_subtransaction_abort(oldcontext, oldowner);
  		return NULL;
  	}
--- 377,382 ----
*************** PLy_cursor_fetch(PyObject *self, PyObjec
*** 461,468 ****
  	}
  	PG_CATCH();
  	{
- 		SPI_freetuptable(SPI_tuptable);
- 
  		PLy_spi_subtransaction_abort(oldcontext, oldowner);
  		return NULL;
  	}
--- 459,464 ----
diff --git a/src/pl/plpython/plpy_spi.c b/src/pl/plpython/plpy_spi.c
index ed1f21cd6a51a83bd57157f16af003654b47b46c..982bf84e0e544661f41d98ca7a6eeeddda288036 100644
*** a/src/pl/plpython/plpy_spi.c
--- b/src/pl/plpython/plpy_spi.c
*************** PLy_spi_execute_fetch_result(SPITupleTab
*** 439,445 ****
  		{
  			MemoryContextSwitchTo(oldcontext);
  			PLy_typeinfo_dealloc(&args);
- 			SPI_freetuptable(tuptable);
  			Py_DECREF(result);
  			PG_RE_THROW();
  		}
--- 439,444 ----
#14Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#13)
Re: Preventing tuple-table leakage in plpgsql

I wrote:

Another point worth making is that this version of the patch deletes the
tuple tables during AtEOSubXact_SPI(), earlier in cleanup than would
happen with the prior version. That increases the risk that external
code might try to delete an already-deleted tuple table, if it tries
to call SPI_freetuptable() during subxact cleanup. The new code won't
crash, although come to think of it it will probably throw an error
because you're not connected anymore. (Maybe this is a reason to not
insist on being connected, but just silently search whatever the top
stack context is?)

After further reflection I think that's the prudent way to do it, so
I've adjusted SPI_freetuptable to not insist on being connected.
Pushed with that change:
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=3d13623d75d3206c8f009353415043a191ebab39

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers