heap_lock_updated_tuple_rec can leak a buffer refcount

Started by Amit Kapila · almost 8 years ago · 5 messages
#1 Amit Kapila
amit.kapila16@gmail.com
1 attachment(s)

It seems to me that heap_lock_updated_tuple_rec can leak a buffer
refcount while locking an updated tuple whose updating transaction has
aborted. In commit 5c609a74, we added the following code to deal with
aborted transactions:

heap_lock_updated_tuple_rec()
{
..

	if (PageIsAllVisible(BufferGetPage(buf)))
		visibilitymap_pin(rel, block, &vmbuffer);
	else
		vmbuffer = InvalidBuffer;

	LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
..
---------------- code below was added by commit 5c609a74 ----------------
	if (TransactionIdDidAbort(HeapTupleHeaderGetXmin(mytup.t_data)))
	{
		UnlockReleaseBuffer(buf);
		return HeapTupleMayBeUpdated;
	}
--------------------------------------------------------------------------

I think the code above forgets to release the vmbuffer pin and can
leak it. The attached patch ensures that vmbuffer is released when
required.
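
To see the shape of the problem in isolation, here is a minimal
hypothetical sketch (stand-in types and helpers, not the actual
heapam.c code): the early return releases only the heap buffer, so a
conditionally acquired vm pin escapes the cleanup path. The patch
instead routes such exits through a common cleanup label.

/* Hypothetical sketch of the leak pattern; not real heapam.c code. */
typedef int Buffer;
#define InvalidBuffer 0

extern void ReleaseBufferPin(Buffer buf);	/* stand-in for ReleaseBuffer */

static int
lock_tuple_sketch(Buffer buf, Buffer vmbuffer, int xmin_aborted)
{
	if (xmin_aborted)
	{
		ReleaseBufferPin(buf);
		return 0;				/* BUG: a pinned vmbuffer is never released */
	}

	/* ... locking work ... */

	/* single exit releasing both pins, as the patch arranges via goto */
	ReleaseBufferPin(buf);
	if (vmbuffer != InvalidBuffer)
		ReleaseBufferPin(vmbuffer);
	return 0;
}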

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

fix_failure_cond_tup_version_locking_v1.patch (application/octet-stream)
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 8a846e7..fac2aec 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -5749,8 +5749,8 @@ l4:
 		 */
 		if (TransactionIdDidAbort(HeapTupleHeaderGetXmin(mytup.t_data)))
 		{
-			UnlockReleaseBuffer(buf);
-			return HeapTupleMayBeUpdated;
+			result = HeapTupleMayBeUpdated;
+			goto out_locked;
 		}
 
 		old_infomask = mytup.t_data->t_infomask;
#2 Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#1)
Re: heap_lock_updated_tuple_rec can leak a buffer refcount

On Tue, Feb 13, 2018 at 10:11 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> It seems to me that heap_lock_updated_tuple_rec can leak a buffer
> refcount while locking an updated tuple whose updating transaction has
> aborted.
> [...]
> I think the code above forgets to release the vmbuffer pin and can
> leak it. The attached patch ensures that vmbuffer is released when
> required.

Registered the patch for next CF:
https://commitfest.postgresql.org/17/1531/

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

#3 Alexander Kuzmenkov
a.kuzmenkov@postgrespro.ru
In reply to: Amit Kapila (#2)
Re: heap_lock_updated_tuple_rec can leak a buffer refcount

The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: not tested
Spec compliant: not tested
Documentation: not tested

Looks like a leak indeed; the fix seems right.

#4 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alexander Kuzmenkov (#3)
Re: heap_lock_updated_tuple_rec can leak a buffer refcount
1 attachment(s)

Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru> writes:
> Looks like a leak indeed; the fix seems right.

Yup, it's a leak. It's hard to hit because you need to be starting
with an update of a tuple in an all-visible page; otherwise we never
pin the vm page so there's nothing to leak. But if you lobotomize
the test a few lines above so that it always pins the vm page, then
the regression tests (specifically combocid) reveal the leak, and
show that the proposed patch indeed fixes it.

However ... with said lobotomization, the isolation tests trigger an
Assert(BufferIsPinned(buffer)) inside visibilitymap_pin, showing that
there's another bug here too. That seems to be because at the bottom
of the outer loop, we do

if (vmbuffer != InvalidBuffer)
ReleaseBuffer(vmbuffer);

and then loop back around with vmbuffer still not equal to InvalidBuffer.
This causes the next loop iteration's visibilitymap_pin call to think it
needs to release that vmbuffer pin a second time; kaboom.
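
The assertion makes sense once you look at visibilitymap_pin's
buffer-reuse contract, which is roughly this (a paraphrase of
visibilitymap.c, not a verbatim excerpt; details vary by version):

/*
 * Rough paraphrase of visibilitymap_pin().  A stale-but-"valid" *vmbuf
 * pointing at the wrong vm page gets released here; if that pin was
 * already dropped at the bottom of the outer loop, this releases it a
 * second time and trips Assert(BufferIsPinned(buffer)) in an
 * assert-enabled build.
 */
void
visibilitymap_pin(Relation rel, BlockNumber heapBlk, Buffer *vmbuf)
{
	BlockNumber	mapBlock = HEAPBLK_TO_MAPBLOCK(heapBlk);

	if (BufferIsValid(*vmbuf))
	{
		if (BufferGetBlockNumber(*vmbuf) == mapBlock)
			return;				/* right page already pinned: reuse it */
		ReleaseBuffer(*vmbuf);	/* requires the old pin to still exist */
	}
	*vmbuf = vm_readbuf(rel, mapBlock, true);
}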

And eyeing this, I see still a third problem: if either of the "goto l4"
jumps occur, we'll loop back to l4 with vmbuffer possibly pinned, and then
if the new page isn't all-visible, we'll just set vmbuffer = InvalidBuffer
and leak the pin that way. (If it is all-visible, we unpin the old page
correctly in the visibilitymap_pin call, but that can charitably be
described as accidental.)
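
Condensing the pre-patch flow around l4 (again not a verbatim excerpt),
the hazard is easy to see:

l4:
	if (PageIsAllVisible(BufferGetPage(buf)))
		visibilitymap_pin(rel, block, &vmbuffer);	/* drops any old pin
													 * internally; correct
													 * only by accident */
	else
		vmbuffer = InvalidBuffer;	/* if a "goto l4" arrived here with
									 * vmbuffer pinned, that pin is now
									 * orphaned and leaks */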

In short, this is pretty darn broken. We need to treat the vmbuffer
variable honestly as state that may persist across either the outer loop
or the "goto l4" sub-loop. Furthermore, it's not really cool to use
"vmbuffer == InvalidBuffer" as the indicator of whether we acquired the
vmbuffer pin pre-lock. To do that, we'd be forced to drop the old pin in
the not-all-visible path, even though we might need it right back again.
Also, remembering that one vm page serves many heap pages, even if we have
a vm pin for the "wrong" page it's conceivable that it'd be the right one
for the next time we actually need it. So we should use the
visibilitymap_pin API the way it's meant to be used, and hold any vm pin
we've acquired until the very end of the function.
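
(For reference, the heap-block-to-vm-block mapping that can make a
"wrong" pin right again is roughly the following, paraphrasing
visibilitymap.c: two bits per heap page, so one vm page covers tens of
thousands of heap pages and consecutive heap blocks almost always share
a vm block.)

#define HEAPBLOCKS_PER_BYTE		4	/* 8 bits per byte / 2 bits per heap page */
#define HEAPBLOCKS_PER_PAGE		(MAPSIZE * HEAPBLOCKS_PER_BYTE)
#define HEAPBLK_TO_MAPBLOCK(x)	((x) / HEAPBLOCKS_PER_PAGE)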

Hence, I propose the attached patch. The test lobotomization
(the "if (1) //" change) isn't meant for commit but shows how I tested
the take-the-pin paths. This passes make check-world with or without
the lobotomization change.

regards, tom lane

Attachments:

fix-vmbuffer-pin-maintenance-v2.patch (text/x-diff; charset=us-ascii)
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index dc762f9..42eac78 100644
*** a/src/backend/access/heap/heapam.c
--- b/src/backend/access/heap/heapam.c
*************** heap_lock_updated_tuple_rec(Relation rel
*** 5677,5682 ****
--- 5677,5683 ----
  				new_xmax;
  	TransactionId priorXmax = InvalidTransactionId;
  	bool		cleared_all_frozen = false;
+ 	bool		pinned_desired_page;
  	Buffer		vmbuffer = InvalidBuffer;
  	BlockNumber block;
  
*************** heap_lock_updated_tuple_rec(Relation rel
*** 5698,5704 ****
  			 * chain, and there's no further tuple to lock: return success to
  			 * caller.
  			 */
! 			return HeapTupleMayBeUpdated;
  		}
  
  l4:
--- 5699,5706 ----
  			 * chain, and there's no further tuple to lock: return success to
  			 * caller.
  			 */
! 			result = HeapTupleMayBeUpdated;
! 			goto out_unlocked;
  		}
  
  l4:
*************** l4:
*** 5710,5719 ****
  		 * someone else might be in the middle of changing this, so we'll need
  		 * to recheck after we have the lock.
  		 */
! 		if (PageIsAllVisible(BufferGetPage(buf)))
  			visibilitymap_pin(rel, block, &vmbuffer);
  		else
! 			vmbuffer = InvalidBuffer;
  
  		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
  
--- 5712,5724 ----
  		 * someone else might be in the middle of changing this, so we'll need
  		 * to recheck after we have the lock.
  		 */
! 		if (1) // (PageIsAllVisible(BufferGetPage(buf)))
! 		{
  			visibilitymap_pin(rel, block, &vmbuffer);
+ 			pinned_desired_page = true;
+ 		}
  		else
! 			pinned_desired_page = false;
  
  		LockBuffer(buf, BUFFER_LOCK_EXCLUSIVE);
  
*************** l4:
*** 5722,5729 ****
  		 * all visible while we were busy locking the buffer, we'll have to
  		 * unlock and re-lock, to avoid holding the buffer lock across I/O.
  		 * That's a bit unfortunate, but hopefully shouldn't happen often.
  		 */
! 		if (vmbuffer == InvalidBuffer && PageIsAllVisible(BufferGetPage(buf)))
  		{
  			LockBuffer(buf, BUFFER_LOCK_UNLOCK);
  			visibilitymap_pin(rel, block, &vmbuffer);
--- 5727,5739 ----
  		 * all visible while we were busy locking the buffer, we'll have to
  		 * unlock and re-lock, to avoid holding the buffer lock across I/O.
  		 * That's a bit unfortunate, but hopefully shouldn't happen often.
+ 		 *
+ 		 * Note: in some paths through this function, we will reach here
+ 		 * holding a pin on a vm page that may or may not be the one matching
+ 		 * this page.  If this page isn't all-visible, we won't use the vm
+ 		 * page, but we hold onto such a pin till the end of the function.
  		 */
! 		if (!pinned_desired_page && PageIsAllVisible(BufferGetPage(buf)))
  		{
  			LockBuffer(buf, BUFFER_LOCK_UNLOCK);
  			visibilitymap_pin(rel, block, &vmbuffer);
*************** l4:
*** 5749,5756 ****
  		 */
  		if (TransactionIdDidAbort(HeapTupleHeaderGetXmin(mytup.t_data)))
  		{
! 			UnlockReleaseBuffer(buf);
! 			return HeapTupleMayBeUpdated;
  		}
  
  		old_infomask = mytup.t_data->t_infomask;
--- 5759,5766 ----
  		 */
  		if (TransactionIdDidAbort(HeapTupleHeaderGetXmin(mytup.t_data)))
  		{
! 			result = HeapTupleMayBeUpdated;
! 			goto out_locked;
  		}
  
  		old_infomask = mytup.t_data->t_infomask;
*************** next:
*** 5957,5964 ****
  		priorXmax = HeapTupleHeaderGetUpdateXid(mytup.t_data);
  		ItemPointerCopy(&(mytup.t_data->t_ctid), &tupid);
  		UnlockReleaseBuffer(buf);
- 		if (vmbuffer != InvalidBuffer)
- 			ReleaseBuffer(vmbuffer);
  	}
  
  	result = HeapTupleMayBeUpdated;
--- 5967,5972 ----
*************** next:
*** 5966,5976 ****
  out_locked:
  	UnlockReleaseBuffer(buf);
  
  	if (vmbuffer != InvalidBuffer)
  		ReleaseBuffer(vmbuffer);
  
  	return result;
- 
  }
  
  /*
--- 5974,5984 ----
  out_locked:
  	UnlockReleaseBuffer(buf);
  
+ out_unlocked:
  	if (vmbuffer != InvalidBuffer)
  		ReleaseBuffer(vmbuffer);
  
  	return result;
  }
  
  /*
#5 Amit Kapila
amit.kapila16@gmail.com
In reply to: Tom Lane (#4)
Re: heap_lock_updated_tuple_rec can leak a buffer refcount

On Sat, Mar 3, 2018 at 3:26 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Alexander Kuzmenkov <a.kuzmenkov@postgrespro.ru> writes:
>> Looks like a leak indeed; the fix seems right.
>
> Hence, I propose the attached patch. The test lobotomization
> (the "if (1) //" change) isn't meant for commit but shows how I tested
> the take-the-pin paths. This passes make check-world with or without
> the lobotomization change.

Thanks for taking care of this.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com