Remove HeapTuple and Buffer dependency for predicate locking functions

Started by Ashwin Agrawalover 6 years ago25 messages
#1Ashwin Agrawal
aagrawal@pivotal.io
1 attachment(s)

Proposing following changes to make predicate locking and checking
functions generic and remove dependency on HeapTuple and Heap AM. We
made these changes to help with Zedstore. I think the changes should
help Zheap and other AMs in general.

- Change PredicateLockTuple() to PredicateLockTID(). So, instead of
passing HeapTuple to it, just pass ItemPointer and tuple insert
transaction id if known. This was also discussed earlier in [1],
don't think was done in that context but would be helpful in future
if such requirement comes up as well.

- CheckForSerializableConflictIn() take blocknum instead of
buffer. Currently, the function anyways does nothing with the buffer
just needs blocknum. Also, helps to decouple dependency on buffer as
not all AMs may have one to one notion between blocknum and single
buffer. Like for zedstore, tuple is stored across individual column
buffers. So, wish to have way to lock not physical buffer but
logical blocknum.

- CheckForSerializableConflictOut() no more takes HeapTuple nor
buffer, instead just takes xid. Push heap specific parts from
CheckForSerializableConflictOut() into its own function
HeapCheckForSerializableConflictOut() which calls
CheckForSerializableConflictOut(). The alternative option could be
CheckForSerializableConflictOut() take callback function and
callback arguments, which gets called if required after performing
prechecks. Though currently I fell AM having its own wrapper to
perform AM specific task and then calling
CheckForSerializableConflictOut() is fine.

Attaching patch which makes these changes.

This way PredicateLockTID(), CheckForSerializableConflictIn() and
CheckForSerializableConflictOut() functions become usable by any AM.

1]
/messages/by-id/CAEepm=2QbqQ_+KQQCnhKukF6NEAeq4SqiO3Qxe+fHza5-H-jKA@mail.gmail.com

Attachments:

v1-0001-Remove-HeapTuple-dependency-for-predicate-locking.patchtext/x-patch; charset=US-ASCII; name=v1-0001-Remove-HeapTuple-dependency-for-predicate-locking.patchDownload
From aac57c17f078f869bf360677556842d58d5d33ea Mon Sep 17 00:00:00 2001
From: Ashwin Agrawal <aagrawal@pivotal.io>
Date: Fri, 21 Jun 2019 13:42:00 -0700
Subject: [PATCH v1] Remove HeapTuple dependency for predicate locking
 functions.

Following changes help to make predicate locking functions generic and
not tied to particular AM.

- PredicateLockTuple() is renamed to PredicateLockTID(). It takes
  ItemPointer and inserting transaction id now instead of HeapTuple.

- CheckForSerializableConflictIn() takes blocknum instead of buffer

- CheckForSerializableConflictOut() no more takes HeapTuple nor buffer
---
 src/backend/access/gin/ginbtree.c        |   2 +-
 src/backend/access/gin/ginfast.c         |   2 +-
 src/backend/access/gin/gininsert.c       |   4 +-
 src/backend/access/gist/gist.c           |   2 +-
 src/backend/access/hash/hashinsert.c     |   2 +-
 src/backend/access/heap/heapam.c         | 103 +++++++++++++++++---
 src/backend/access/heap/heapam_handler.c |   7 +-
 src/backend/access/index/indexam.c       |   4 +-
 src/backend/access/nbtree/nbtinsert.c    |   4 +-
 src/backend/storage/lmgr/predicate.c     | 119 ++++++++---------------
 src/include/access/heapam.h              |   2 +
 src/include/storage/predicate.h          |   9 +-
 12 files changed, 150 insertions(+), 110 deletions(-)

diff --git a/src/backend/access/gin/ginbtree.c b/src/backend/access/gin/ginbtree.c
index 11a8ed7bbc2..e795375495b 100644
--- a/src/backend/access/gin/ginbtree.c
+++ b/src/backend/access/gin/ginbtree.c
@@ -89,7 +89,7 @@ ginFindLeafPage(GinBtree btree, bool searchMode,
 	stack->predictNumber = 1;
 
 	if (rootConflictCheck)
-		CheckForSerializableConflictIn(btree->index, NULL, stack->buffer);
+		CheckForSerializableConflictIn(btree->index, NULL, btree->rootBlkno);
 
 	for (;;)
 	{
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index 2b3dd1c677f..f8ffeb06f8a 100644
--- a/src/backend/access/gin/ginfast.c
+++ b/src/backend/access/gin/ginfast.c
@@ -246,7 +246,7 @@ ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector)
 	 * tree, so it conflicts with all serializable scans.  All scans acquire a
 	 * predicate lock on the metabuffer to represent that.
 	 */
-	CheckForSerializableConflictIn(index, NULL, metabuffer);
+	CheckForSerializableConflictIn(index, NULL, GIN_METAPAGE_BLKNO);
 
 	if (collector->sumsize + collector->ntuples * sizeof(ItemIdData) > GinListPageSize)
 	{
diff --git a/src/backend/access/gin/gininsert.c b/src/backend/access/gin/gininsert.c
index 55eab146173..046a20a3d41 100644
--- a/src/backend/access/gin/gininsert.c
+++ b/src/backend/access/gin/gininsert.c
@@ -221,7 +221,7 @@ ginEntryInsert(GinState *ginstate,
 			return;
 		}
 
-		CheckForSerializableConflictIn(ginstate->index, NULL, stack->buffer);
+		CheckForSerializableConflictIn(ginstate->index, NULL, BufferGetBlockNumber(stack->buffer));
 		/* modify an existing leaf entry */
 		itup = addItemPointersToLeafTuple(ginstate, itup,
 										  items, nitem, buildStats, stack->buffer);
@@ -230,7 +230,7 @@ ginEntryInsert(GinState *ginstate,
 	}
 	else
 	{
-		CheckForSerializableConflictIn(ginstate->index, NULL, stack->buffer);
+		CheckForSerializableConflictIn(ginstate->index, NULL, BufferGetBlockNumber(stack->buffer));
 		/* no match, so construct a new leaf entry */
 		itup = buildFreshLeafTuple(ginstate, attnum, key, category,
 								   items, nitem, buildStats, stack->buffer);
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 470b121e7da..6c0cc20fe1a 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -1273,7 +1273,7 @@ gistinserttuples(GISTInsertState *state, GISTInsertStack *stack,
 	 * Check for any rw conflicts (in serializable isolation level) just
 	 * before we intend to modify the page
 	 */
-	CheckForSerializableConflictIn(state->r, NULL, stack->buffer);
+	CheckForSerializableConflictIn(state->r, NULL, BufferGetBlockNumber(stack->buffer));
 
 	/* Insert the tuple(s) to the page, splitting the page if necessary */
 	is_split = gistplacetopage(state->r, state->freespace, giststate,
diff --git a/src/backend/access/hash/hashinsert.c b/src/backend/access/hash/hashinsert.c
index 5321762d5ea..e3fb47f9e3d 100644
--- a/src/backend/access/hash/hashinsert.c
+++ b/src/backend/access/hash/hashinsert.c
@@ -88,7 +88,7 @@ restart_insert:
 										  &usedmetap);
 	Assert(usedmetap != NULL);
 
-	CheckForSerializableConflictIn(rel, NULL, buf);
+	CheckForSerializableConflictIn(rel, NULL, BufferGetBlockNumber(buf));
 
 	/* remember the primary bucket buffer to release the pin on it at end. */
 	bucket_buf = buf;
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index d768b9b061c..fb574597876 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -446,7 +446,7 @@ heapgetpage(TableScanDesc sscan, BlockNumber page)
 			else
 				valid = HeapTupleSatisfiesVisibility(&loctup, snapshot, buffer);
 
-			CheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
+			HeapCheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
 											&loctup, buffer, snapshot);
 
 			if (valid)
@@ -668,7 +668,7 @@ heapgettup(HeapScanDesc scan,
 													 snapshot,
 													 scan->rs_cbuf);
 
-				CheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
+				HeapCheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
 												tuple, scan->rs_cbuf,
 												snapshot);
 
@@ -1477,9 +1477,10 @@ heap_fetch(Relation relation,
 	valid = HeapTupleSatisfiesVisibility(tuple, snapshot, buffer);
 
 	if (valid)
-		PredicateLockTuple(relation, tuple, snapshot);
+		PredicateLockTID(relation, &(tuple->t_self), snapshot,
+						 HeapTupleHeaderGetXmin(tuple->t_data));
 
-	CheckForSerializableConflictOut(valid, relation, tuple, buffer, snapshot);
+	HeapCheckForSerializableConflictOut(valid, relation, tuple, buffer, snapshot);
 
 	LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
 
@@ -1613,7 +1614,7 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
 
 			/* If it's visible per the snapshot, we must return it */
 			valid = HeapTupleSatisfiesVisibility(heapTuple, snapshot, buffer);
-			CheckForSerializableConflictOut(valid, relation, heapTuple,
+			HeapCheckForSerializableConflictOut(valid, relation, heapTuple,
 											buffer, snapshot);
 			/* reset to original, non-redirected, tid */
 			heapTuple->t_self = *tid;
@@ -1621,7 +1622,8 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
 			if (valid)
 			{
 				ItemPointerSetOffsetNumber(tid, offnum);
-				PredicateLockTuple(relation, heapTuple, snapshot);
+				PredicateLockTID(relation, &(heapTuple)->t_self, snapshot,
+								 HeapTupleHeaderGetXmin(heapTuple->t_data));
 				if (all_dead)
 					*all_dead = false;
 				return true;
@@ -1755,7 +1757,7 @@ heap_get_latest_tid(TableScanDesc sscan,
 		 * candidate.
 		 */
 		valid = HeapTupleSatisfiesVisibility(&tp, snapshot, buffer);
-		CheckForSerializableConflictOut(valid, relation, &tp, buffer, snapshot);
+		HeapCheckForSerializableConflictOut(valid, relation, &tp, buffer, snapshot);
 		if (valid)
 			*tid = ctid;
 
@@ -1910,7 +1912,7 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
 	 * lock "gaps" as index page locks do.  So we don't need to specify a
 	 * buffer when making the call, which makes for a faster check.
 	 */
-	CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);
+	CheckForSerializableConflictIn(relation, NULL, InvalidBlockNumber);
 
 	/* NO EREPORT(ERROR) from here till changes are logged */
 	START_CRIT_SECTION();
@@ -2164,7 +2166,7 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
 	 * lock "gaps" as index page locks do.  So we don't need to specify a
 	 * buffer when making the call, which makes for a faster check.
 	 */
-	CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);
+	CheckForSerializableConflictIn(relation, NULL, InvalidBlockNumber);
 
 	ndone = 0;
 	while (ndone < ntuples)
@@ -2355,7 +2357,7 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
 	 * lock "gaps" as index page locks do.  So we don't need to specify a
 	 * buffer when making the call.
 	 */
-	CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);
+	CheckForSerializableConflictIn(relation, NULL, InvalidBlockNumber);
 
 	/*
 	 * If tuples are cachable, mark them for invalidation from the caches in
@@ -2669,7 +2671,7 @@ l1:
 	 * being visible to the scan (i.e., an exclusive buffer content lock is
 	 * continuously held from this point until the tuple delete is visible).
 	 */
-	CheckForSerializableConflictIn(relation, &tp, buffer);
+	CheckForSerializableConflictIn(relation, tid, BufferGetBlockNumber(buffer));
 
 	/* replace cid with a combo cid if necessary */
 	HeapTupleHeaderAdjustCmax(tp.t_data, &cid, &iscombo);
@@ -3584,7 +3586,7 @@ l2:
 	 * will include checking the relation level, there is no benefit to a
 	 * separate check for the new tuple.
 	 */
-	CheckForSerializableConflictIn(relation, &oldtup, buffer);
+	CheckForSerializableConflictIn(relation, otid, BufferGetBlockNumber(buffer));
 
 	/*
 	 * At this point newbuf and buffer are both pinned and locked, and newbuf
@@ -9044,3 +9046,80 @@ heap_mask(char *pagedata, BlockNumber blkno)
 		}
 	}
 }
+
+/*
+ * HeapCheckForSerializableConflictOut
+ *		We are reading a tuple which has been modified.  If it is visible to
+ *		us but has been deleted, that indicates a rw-conflict out.  If it's
+ *		not visible and was created by a concurrent (overlapping)
+ *		serializable transaction, that is also a rw-conflict out,
+ *
+ * We will determine the top level xid of the writing transaction with which
+ * we may be in conflict, and check for overlap with our own transaction.
+ * If the transactions overlap (i.e., they cannot see each other's writes),
+ * then we have a conflict out.
+ *
+ * This function should be called just about anywhere in heapam.c where a
+ * tuple has been read. The caller must hold at least a shared lock on the
+ * buffer, because this function might set hint bits on the tuple. There is
+ * currently no known reason to call this function from an index AM.
+ */
+void
+HeapCheckForSerializableConflictOut(bool visible, Relation relation,
+									HeapTuple tuple, Buffer buffer,
+									Snapshot snapshot)
+{
+	TransactionId xid;
+	HTSV_Result htsvResult;
+
+	if (!CheckForSerializableConflictOutNeeded(relation, snapshot))
+		return;
+
+	/*
+	 * Check to see whether the tuple has been written to by a concurrent
+	 * transaction, either to create it not visible to us, or to delete it
+	 * while it is visible to us.  The "visible" bool indicates whether the
+	 * tuple is visible to us, while HeapTupleSatisfiesVacuum checks what else
+	 * is going on with it.
+	 */
+	htsvResult = HeapTupleSatisfiesVacuum(tuple, TransactionXmin, buffer);
+	switch (htsvResult)
+	{
+		case HEAPTUPLE_LIVE:
+			if (visible)
+				return;
+			xid = HeapTupleHeaderGetXmin(tuple->t_data);
+			break;
+		case HEAPTUPLE_RECENTLY_DEAD:
+			if (!visible)
+				return;
+			xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
+			break;
+		case HEAPTUPLE_DELETE_IN_PROGRESS:
+			xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
+			break;
+		case HEAPTUPLE_INSERT_IN_PROGRESS:
+			xid = HeapTupleHeaderGetXmin(tuple->t_data);
+			break;
+		case HEAPTUPLE_DEAD:
+			return;
+		default:
+
+			/*
+			 * The only way to get to this default clause is if a new value is
+			 * added to the enum type without adding it to this switch
+			 * statement.  That's a bug, so elog.
+			 */
+			elog(ERROR, "unrecognized return value from HeapTupleSatisfiesVacuum: %u", htsvResult);
+
+			/*
+			 * In spite of having all enum values covered and calling elog on
+			 * this default, some compilers think this is a code path which
+			 * allows xid to be used below without initialization. Silence
+			 * that warning.
+			 */
+			xid = InvalidTransactionId;
+	}
+
+	return CheckForSerializableConflictOut(relation, xid, snapshot);
+}
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index fc19f40a2e3..29946711847 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -2273,9 +2273,10 @@ heapam_scan_bitmap_next_block(TableScanDesc scan,
 			if (valid)
 			{
 				hscan->rs_vistuples[ntup++] = offnum;
-				PredicateLockTuple(scan->rs_rd, &loctup, snapshot);
+				PredicateLockTID(scan->rs_rd, &loctup.t_self, snapshot,
+								 HeapTupleHeaderGetXmin(loctup.t_data));
 			}
-			CheckForSerializableConflictOut(valid, scan->rs_rd, &loctup,
+			HeapCheckForSerializableConflictOut(valid, scan->rs_rd, &loctup,
 											buffer, snapshot);
 		}
 	}
@@ -2463,7 +2464,7 @@ heapam_scan_sample_next_tuple(TableScanDesc scan, SampleScanState *scanstate,
 
 			/* in pagemode, heapgetpage did this for us */
 			if (!pagemode)
-				CheckForSerializableConflictOut(visible, scan->rs_rd, tuple,
+				HeapCheckForSerializableConflictOut(visible, scan->rs_rd, tuple,
 												hscan->rs_cbuf, scan->rs_snapshot);
 
 			/* Try next tuple from same page. */
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index aefdd2916de..61ed3167fec 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -180,8 +180,8 @@ index_insert(Relation indexRelation,
 
 	if (!(indexRelation->rd_indam->ampredlocks))
 		CheckForSerializableConflictIn(indexRelation,
-									   (HeapTuple) NULL,
-									   InvalidBuffer);
+									   (ItemPointer) NULL,
+									   InvalidBlockNumber);
 
 	return indexRelation->rd_indam->aminsert(indexRelation, values, isnull,
 											 heap_t_ctid, heapRelation,
diff --git a/src/backend/access/nbtree/nbtinsert.c b/src/backend/access/nbtree/nbtinsert.c
index 602f8849d4a..22fcb26a069 100644
--- a/src/backend/access/nbtree/nbtinsert.c
+++ b/src/backend/access/nbtree/nbtinsert.c
@@ -290,7 +290,7 @@ top:
 		 * checkingunique and !heapkeyspace cases, but it's okay to use the
 		 * first page the value could be on (with scantid omitted) instead.
 		 */
-		CheckForSerializableConflictIn(rel, NULL, insertstate.buf);
+		CheckForSerializableConflictIn(rel, NULL, BufferGetBlockNumber(insertstate.buf));
 
 		/*
 		 * Do the insertion.  Note that insertstate contains cached binary
@@ -533,7 +533,7 @@ _bt_check_unique(Relation rel, BTInsertState insertstate, Relation heapRel,
 					 * otherwise be masked by this unique constraint
 					 * violation.
 					 */
-					CheckForSerializableConflictIn(rel, NULL, insertstate->buf);
+					CheckForSerializableConflictIn(rel, NULL, BufferGetBlockNumber(insertstate->buf));
 
 					/*
 					 * This is a definite conflict.  Break the tuple down into
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 2fedbc4c15f..8b50dc4fbdb 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -163,8 +163,8 @@
  *		PredicateLockRelation(Relation relation, Snapshot snapshot)
  *		PredicateLockPage(Relation relation, BlockNumber blkno,
  *						Snapshot snapshot)
- *		PredicateLockTuple(Relation relation, HeapTuple tuple,
- *						Snapshot snapshot)
+ *		PredicateLockTID(Relation relation, ItemPointer tid, Snapshot snapshot,
+ *						 TransactionId insert_xid)
  *		PredicateLockPageSplit(Relation relation, BlockNumber oldblkno,
  *							   BlockNumber newblkno)
  *		PredicateLockPageCombine(Relation relation, BlockNumber oldblkno,
@@ -173,11 +173,10 @@
  *		ReleasePredicateLocks(bool isCommit, bool isReadOnlySafe)
  *
  * conflict detection (may also trigger rollback)
- *		CheckForSerializableConflictOut(bool visible, Relation relation,
- *										HeapTupleData *tup, Buffer buffer,
+ *		CheckForSerializableConflictOut(Relation relation, TransactionId xid,
  *										Snapshot snapshot)
- *		CheckForSerializableConflictIn(Relation relation, HeapTupleData *tup,
- *									   Buffer buffer)
+ *		CheckForSerializableConflictIn(Relation relation, ItemPointer tid,
+ *									   BlockNumber blkno)
  *		CheckTableForSerializableConflictIn(Relation relation)
  *
  * final rollback checking
@@ -193,8 +192,6 @@
 
 #include "postgres.h"
 
-#include "access/heapam.h"
-#include "access/htup_details.h"
 #include "access/parallel.h"
 #include "access/slru.h"
 #include "access/subtrans.h"
@@ -2538,37 +2535,34 @@ PredicateLockPage(Relation relation, BlockNumber blkno, Snapshot snapshot)
 }
 
 /*
- *		PredicateLockTuple
+ *		PredicateLockTID
  *
  * Gets a predicate lock at the tuple level.
  * Skip if not in full serializable transaction isolation level.
  * Skip if this is a temporary table.
  */
 void
-PredicateLockTuple(Relation relation, HeapTuple tuple, Snapshot snapshot)
+PredicateLockTID(Relation relation, ItemPointer tid, Snapshot snapshot,
+				 TransactionId insert_xid)
 {
 	PREDICATELOCKTARGETTAG tag;
-	ItemPointer tid;
-	TransactionId targetxmin;
 
 	if (!SerializationNeededForRead(relation, snapshot))
 		return;
 
 	/*
-	 * If it's a heap tuple, return if this xact wrote it.
+	 * Return if this xact wrote it.
 	 */
 	if (relation->rd_index == NULL)
 	{
 		TransactionId myxid;
 
-		targetxmin = HeapTupleHeaderGetXmin(tuple->t_data);
-
 		myxid = GetTopTransactionIdIfAny();
 		if (TransactionIdIsValid(myxid))
 		{
-			if (TransactionIdFollowsOrEquals(targetxmin, TransactionXmin))
+			if (TransactionIdFollowsOrEquals(insert_xid, TransactionXmin))
 			{
-				TransactionId xid = SubTransGetTopmostTransaction(targetxmin);
+				TransactionId xid = SubTransGetTopmostTransaction(insert_xid);
 
 				if (TransactionIdEquals(xid, myxid))
 				{
@@ -2591,7 +2585,6 @@ PredicateLockTuple(Relation relation, HeapTuple tuple, Snapshot snapshot)
 	if (PredicateLockExists(&tag))
 		return;
 
-	tid = &(tuple->t_self);
 	SET_PREDICATELOCKTARGETTAG_TUPLE(tag,
 									 relation->rd_node.dbNode,
 									 relation->rd_id,
@@ -4036,33 +4029,43 @@ XidIsConcurrent(TransactionId xid)
 	return false;
 }
 
+bool
+CheckForSerializableConflictOutNeeded(Relation relation, Snapshot snapshot)
+{
+	if (!SerializationNeededForRead(relation, snapshot))
+		return false;
+
+	/* Check if someone else has already decided that we need to die */
+	if (SxactIsDoomed(MySerializableXact))
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+				 errmsg("could not serialize access due to read/write dependencies among transactions"),
+				 errdetail_internal("Reason code: Canceled on identification as a pivot, during conflict out checking."),
+				 errhint("The transaction might succeed if retried.")));
+	}
+
+	return true;
+}
+
 /*
  * CheckForSerializableConflictOut
  *		We are reading a tuple which has been modified.  If it is visible to
  *		us but has been deleted, that indicates a rw-conflict out.  If it's
  *		not visible and was created by a concurrent (overlapping)
- *		serializable transaction, that is also a rw-conflict out,
+ *		serializable transaction, that is also a rw-conflict out.
  *
  * We will determine the top level xid of the writing transaction with which
  * we may be in conflict, and check for overlap with our own transaction.
  * If the transactions overlap (i.e., they cannot see each other's writes),
  * then we have a conflict out.
- *
- * This function should be called just about anywhere in heapam.c where a
- * tuple has been read. The caller must hold at least a shared lock on the
- * buffer, because this function might set hint bits on the tuple. There is
- * currently no known reason to call this function from an index AM.
  */
 void
-CheckForSerializableConflictOut(bool visible, Relation relation,
-								HeapTuple tuple, Buffer buffer,
-								Snapshot snapshot)
+CheckForSerializableConflictOut(Relation relation, TransactionId xid, Snapshot snapshot)
 {
-	TransactionId xid;
 	SERIALIZABLEXIDTAG sxidtag;
 	SERIALIZABLEXID *sxid;
 	SERIALIZABLEXACT *sxact;
-	HTSV_Result htsvResult;
 
 	if (!SerializationNeededForRead(relation, snapshot))
 		return;
@@ -4077,51 +4080,6 @@ CheckForSerializableConflictOut(bool visible, Relation relation,
 				 errhint("The transaction might succeed if retried.")));
 	}
 
-	/*
-	 * Check to see whether the tuple has been written to by a concurrent
-	 * transaction, either to create it not visible to us, or to delete it
-	 * while it is visible to us.  The "visible" bool indicates whether the
-	 * tuple is visible to us, while HeapTupleSatisfiesVacuum checks what else
-	 * is going on with it.
-	 */
-	htsvResult = HeapTupleSatisfiesVacuum(tuple, TransactionXmin, buffer);
-	switch (htsvResult)
-	{
-		case HEAPTUPLE_LIVE:
-			if (visible)
-				return;
-			xid = HeapTupleHeaderGetXmin(tuple->t_data);
-			break;
-		case HEAPTUPLE_RECENTLY_DEAD:
-			if (!visible)
-				return;
-			xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
-			break;
-		case HEAPTUPLE_DELETE_IN_PROGRESS:
-			xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
-			break;
-		case HEAPTUPLE_INSERT_IN_PROGRESS:
-			xid = HeapTupleHeaderGetXmin(tuple->t_data);
-			break;
-		case HEAPTUPLE_DEAD:
-			return;
-		default:
-
-			/*
-			 * The only way to get to this default clause is if a new value is
-			 * added to the enum type without adding it to this switch
-			 * statement.  That's a bug, so elog.
-			 */
-			elog(ERROR, "unrecognized return value from HeapTupleSatisfiesVacuum: %u", htsvResult);
-
-			/*
-			 * In spite of having all enum values covered and calling elog on
-			 * this default, some compilers think this is a code path which
-			 * allows xid to be used below without initialization. Silence
-			 * that warning.
-			 */
-			xid = InvalidTransactionId;
-	}
 	Assert(TransactionIdIsValid(xid));
 	Assert(TransactionIdFollowsOrEquals(xid, TransactionXmin));
 
@@ -4439,8 +4397,7 @@ CheckTargetForConflictsIn(PREDICATELOCKTARGETTAG *targettag)
  * tuple itself.
  */
 void
-CheckForSerializableConflictIn(Relation relation, HeapTuple tuple,
-							   Buffer buffer)
+CheckForSerializableConflictIn(Relation relation, ItemPointer tid, BlockNumber blkno)
 {
 	PREDICATELOCKTARGETTAG targettag;
 
@@ -4470,22 +4427,22 @@ CheckForSerializableConflictIn(Relation relation, HeapTuple tuple,
 	 * It is not possible to take and hold a lock across the checks for all
 	 * granularities because each target could be in a separate partition.
 	 */
-	if (tuple != NULL)
+	if (tid != NULL)
 	{
 		SET_PREDICATELOCKTARGETTAG_TUPLE(targettag,
 										 relation->rd_node.dbNode,
 										 relation->rd_id,
-										 ItemPointerGetBlockNumber(&(tuple->t_self)),
-										 ItemPointerGetOffsetNumber(&(tuple->t_self)));
+										 ItemPointerGetBlockNumber(tid),
+										 ItemPointerGetOffsetNumber(tid));
 		CheckTargetForConflictsIn(&targettag);
 	}
 
-	if (BufferIsValid(buffer))
+	if (blkno != InvalidBlockNumber)
 	{
 		SET_PREDICATELOCKTARGETTAG_PAGE(targettag,
 										relation->rd_node.dbNode,
 										relation->rd_id,
-										BufferGetBlockNumber(buffer));
+										blkno);
 		CheckTargetForConflictsIn(&targettag);
 	}
 
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index dffb57bf11a..99cd574b2cb 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -217,5 +217,7 @@ extern bool ResolveCminCmaxDuringDecoding(struct HTAB *tuplecid_data,
 										  HeapTuple htup,
 										  Buffer buffer,
 										  CommandId *cmin, CommandId *cmax);
+extern void HeapCheckForSerializableConflictOut(bool valid, Relation relation, HeapTuple tuple,
+												Buffer buffer, Snapshot snapshot);
 
 #endif							/* HEAPAM_H */
diff --git a/src/include/storage/predicate.h b/src/include/storage/predicate.h
index 376245ecd70..7e1f67b8edb 100644
--- a/src/include/storage/predicate.h
+++ b/src/include/storage/predicate.h
@@ -57,16 +57,17 @@ extern void SetSerializableTransactionSnapshot(Snapshot snapshot,
 extern void RegisterPredicateLockingXid(TransactionId xid);
 extern void PredicateLockRelation(Relation relation, Snapshot snapshot);
 extern void PredicateLockPage(Relation relation, BlockNumber blkno, Snapshot snapshot);
-extern void PredicateLockTuple(Relation relation, HeapTuple tuple, Snapshot snapshot);
+extern void PredicateLockTID(Relation relation, ItemPointer tid, Snapshot snapshot,
+							 TransactionId insert_xid);
 extern void PredicateLockPageSplit(Relation relation, BlockNumber oldblkno, BlockNumber newblkno);
 extern void PredicateLockPageCombine(Relation relation, BlockNumber oldblkno, BlockNumber newblkno);
 extern void TransferPredicateLocksToHeapRelation(Relation relation);
 extern void ReleasePredicateLocks(bool isCommit, bool isReadOnlySafe);
 
 /* conflict detection (may also trigger rollback) */
-extern void CheckForSerializableConflictOut(bool valid, Relation relation, HeapTuple tuple,
-											Buffer buffer, Snapshot snapshot);
-extern void CheckForSerializableConflictIn(Relation relation, HeapTuple tuple, Buffer buffer);
+extern bool CheckForSerializableConflictOutNeeded(Relation relation, Snapshot snapshot);
+extern void CheckForSerializableConflictOut(Relation relation, TransactionId xid, Snapshot snapshot);
+extern void CheckForSerializableConflictIn(Relation relation, ItemPointer tid, BlockNumber blkno);
 extern void CheckTableForSerializableConflictIn(Relation relation);
 
 /* final rollback checking */
-- 
2.19.1

#2Andres Freund
andres@anarazel.de
In reply to: Ashwin Agrawal (#1)
Re: Remove HeapTuple and Buffer dependency for predicate locking functions

Hi,

On 2019-06-24 10:41:06 -0700, Ashwin Agrawal wrote:

Proposing following changes to make predicate locking and checking
functions generic and remove dependency on HeapTuple and Heap AM. We
made these changes to help with Zedstore. I think the changes should
help Zheap and other AMs in general.

Indeed.

- Change PredicateLockTuple() to PredicateLockTID(). So, instead of
passing HeapTuple to it, just pass ItemPointer and tuple insert
transaction id if known. This was also discussed earlier in [1],
don't think was done in that context but would be helpful in future
if such requirement comes up as well.

Right.

- CheckForSerializableConflictIn() take blocknum instead of
buffer. Currently, the function anyways does nothing with the buffer
just needs blocknum. Also, helps to decouple dependency on buffer as
not all AMs may have one to one notion between blocknum and single
buffer. Like for zedstore, tuple is stored across individual column
buffers. So, wish to have way to lock not physical buffer but
logical blocknum.

Hm. I wonder if we somehow ought to generalize the granularity scheme
for predicate locks to not be tuple/page/relation. But even if, that's
probably a separate patch.

- CheckForSerializableConflictOut() no more takes HeapTuple nor
buffer, instead just takes xid. Push heap specific parts from
CheckForSerializableConflictOut() into its own function
HeapCheckForSerializableConflictOut() which calls
CheckForSerializableConflictOut(). The alternative option could be
CheckForSerializableConflictOut() take callback function and
callback arguments, which gets called if required after performing
prechecks. Though currently I fell AM having its own wrapper to
perform AM specific task and then calling
CheckForSerializableConflictOut() is fine.

I think it's right to move the xid handling out of
CheckForSerializableConflictOut(). But I think we also ought to move the
subtransaction handling out of the function - e.g. zheap doesn't
want/need that.

Attaching patch which makes these changes.

Please make sure that there's a CF entry for this (I'm in a plane with a
super slow connection, otherwise I'd check).

Greetings,

Andres Freund

#3Thomas Munro
thomas.munro@gmail.com
In reply to: Andres Freund (#2)
Re: Remove HeapTuple and Buffer dependency for predicate locking functions

On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <andres@anarazel.de> wrote:

On 2019-06-24 10:41:06 -0700, Ashwin Agrawal wrote:

Proposing following changes to make predicate locking and checking
functions generic and remove dependency on HeapTuple and Heap AM. We
made these changes to help with Zedstore. I think the changes should
help Zheap and other AMs in general.

Indeed.

+1

- Change PredicateLockTuple() to PredicateLockTID(). So, instead of
passing HeapTuple to it, just pass ItemPointer and tuple insert
transaction id if known. This was also discussed earlier in [1],
don't think was done in that context but would be helpful in future
if such requirement comes up as well.

Right.

+1

- CheckForSerializableConflictIn() take blocknum instead of
buffer. Currently, the function anyways does nothing with the buffer
just needs blocknum. Also, helps to decouple dependency on buffer as
not all AMs may have one to one notion between blocknum and single
buffer. Like for zedstore, tuple is stored across individual column
buffers. So, wish to have way to lock not physical buffer but
logical blocknum.

Hm. I wonder if we somehow ought to generalize the granularity scheme
for predicate locks to not be tuple/page/relation. But even if, that's
probably a separate patch.

What do you have in mind? This is certainly the traditional way to do
lock hierarchies (archeological fun: we used to have
src/backend/storage/lock/multi.c, a "standard multi-level lock manager
as per the Gray paper", before commits 3f7fbf85 and e6e9e18e
decommissioned it; SSI brought the concept back again in a parallel
lock manager), but admittedly it has assumptions about physical
storage baked into the naming. Perhaps you just want to give those
things different labels, "TID range" instead of page, for the benefit
of "logical" TID users? Perhaps you want to permit more levels? That
seems premature as long as TIDs are defined in terms of blocks and
offsets, so this stuff is reflected all over the source tree...

- CheckForSerializableConflictOut() no more takes HeapTuple nor
buffer, instead just takes xid. Push heap specific parts from
CheckForSerializableConflictOut() into its own function
HeapCheckForSerializableConflictOut() which calls
CheckForSerializableConflictOut(). The alternative option could be
CheckForSerializableConflictOut() take callback function and
callback arguments, which gets called if required after performing
prechecks. Though currently I fell AM having its own wrapper to
perform AM specific task and then calling
CheckForSerializableConflictOut() is fine.

I think it's right to move the xid handling out of
CheckForSerializableConflictOut(). But I think we also ought to move the
subtransaction handling out of the function - e.g. zheap doesn't
want/need that.

Thoughts on this Ashwin?

Attaching patch which makes these changes.

Please make sure that there's a CF entry for this (I'm in a plane with a
super slow connection, otherwise I'd check).

This all makes sense, and I'd like to be able to commit this soon. We
come up with something nearly identical for zheap. I'm subscribing
Kuntal Ghosh to see if he has comments, as he worked on that. The
main point of difference seems to be the assumption about how
subtransactions work.

--
Thomas Munro
https://enterprisedb.com

#4Ashwin Agrawal
aagrawal@pivotal.io
In reply to: Thomas Munro (#3)
Re: Remove HeapTuple and Buffer dependency for predicate locking functions

On Tue, Jul 30, 2019 at 2:58 PM Thomas Munro <thomas.munro@gmail.com> wrote:

On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <andres@anarazel.de> wrote:

- CheckForSerializableConflictOut() no more takes HeapTuple nor
buffer, instead just takes xid. Push heap specific parts from
CheckForSerializableConflictOut() into its own function
HeapCheckForSerializableConflictOut() which calls
CheckForSerializableConflictOut(). The alternative option could be
CheckForSerializableConflictOut() take callback function and
callback arguments, which gets called if required after performing
prechecks. Though currently I fell AM having its own wrapper to
perform AM specific task and then calling
CheckForSerializableConflictOut() is fine.

I think it's right to move the xid handling out of
CheckForSerializableConflictOut(). But I think we also ought to move the
subtransaction handling out of the function - e.g. zheap doesn't
want/need that.

Thoughts on this Ashwin?

I think the only part its doing for sub-transaction is
SubTransGetTopmostTransaction(xid). If xid passed to this function is
already top most transaction which is case for zheap and zedstore, then
there is no downside to keeping that code here in common place.

#5Andres Freund
andres@anarazel.de
In reply to: Thomas Munro (#3)
Re: Remove HeapTuple and Buffer dependency for predicate locking functions

Hi,

On 2019-07-31 09:57:58 +1200, Thomas Munro wrote:

On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <andres@anarazel.de> wrote:

Hm. I wonder if we somehow ought to generalize the granularity scheme
for predicate locks to not be tuple/page/relation. But even if, that's
probably a separate patch.

What do you have in mind?

My concern is that continuing to inferring the granularity levels from
the tid doesn't seem like a great path forward. An AMs use of tids might
not necessarily be very amenable to that, if the mapping isn't actually
block based.

Perhaps you just want to give those things different labels, "TID
range" instead of page, for the benefit of "logical" TID users?
Perhaps you want to permit more levels? That seems premature as long
as TIDs are defined in terms of blocks and offsets, so this stuff is
reflected all over the source tree...

I'm mostly wondering if the different levels shouldn't be computed
outside of predicate.c.

Greetings,

Andres Freund

#6Andres Freund
andres@anarazel.de
In reply to: Ashwin Agrawal (#4)
Re: Remove HeapTuple and Buffer dependency for predicate locking functions

Hi,

On 2019-07-31 10:42:50 -0700, Ashwin Agrawal wrote:

On Tue, Jul 30, 2019 at 2:58 PM Thomas Munro <thomas.munro@gmail.com> wrote:

On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <andres@anarazel.de> wrote:

- CheckForSerializableConflictOut() no more takes HeapTuple nor
buffer, instead just takes xid. Push heap specific parts from
CheckForSerializableConflictOut() into its own function
HeapCheckForSerializableConflictOut() which calls
CheckForSerializableConflictOut(). The alternative option could be
CheckForSerializableConflictOut() take callback function and
callback arguments, which gets called if required after performing
prechecks. Though currently I fell AM having its own wrapper to
perform AM specific task and then calling
CheckForSerializableConflictOut() is fine.

I think it's right to move the xid handling out of
CheckForSerializableConflictOut(). But I think we also ought to move the
subtransaction handling out of the function - e.g. zheap doesn't
want/need that.

Thoughts on this Ashwin?

I think the only part its doing for sub-transaction is
SubTransGetTopmostTransaction(xid). If xid passed to this function is
already top most transaction which is case for zheap and zedstore, then
there is no downside to keeping that code here in common place.

Well, it's far from a cheap function. It'll do unnecessary on-disk
lookups in many cases. I'd call that quite a downside.

Greetings,

Andres Freund

#7Ashwin Agrawal
aagrawal@pivotal.io
In reply to: Andres Freund (#6)
Re: Remove HeapTuple and Buffer dependency for predicate locking functions

On Wed, Jul 31, 2019 at 10:55 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2019-07-31 10:42:50 -0700, Ashwin Agrawal wrote:

On Tue, Jul 30, 2019 at 2:58 PM Thomas Munro <thomas.munro@gmail.com>

wrote:

On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <andres@anarazel.de>

wrote:

- CheckForSerializableConflictOut() no more takes HeapTuple nor
buffer, instead just takes xid. Push heap specific parts from
CheckForSerializableConflictOut() into its own function
HeapCheckForSerializableConflictOut() which calls
CheckForSerializableConflictOut(). The alternative option could

be

CheckForSerializableConflictOut() take callback function and
callback arguments, which gets called if required after

performing

prechecks. Though currently I fell AM having its own wrapper to
perform AM specific task and then calling
CheckForSerializableConflictOut() is fine.

I think it's right to move the xid handling out of
CheckForSerializableConflictOut(). But I think we also ought to move

the

subtransaction handling out of the function - e.g. zheap doesn't
want/need that.

Thoughts on this Ashwin?

I think the only part its doing for sub-transaction is
SubTransGetTopmostTransaction(xid). If xid passed to this function is
already top most transaction which is case for zheap and zedstore, then
there is no downside to keeping that code here in common place.

Well, it's far from a cheap function. It'll do unnecessary on-disk
lookups in many cases. I'd call that quite a downside.

Okay, agree, its costly function and better to avoid the call if possible.

Instead of moving the handling out of the function, how do feel about
adding boolean isTopTransactionId argument to function
CheckForSerializableConflictOut(). The AMs, which implicitly know, only
pass top transaction Id to this function, can pass true and avoid the
function call to SubTransGetTopmostTransaction(xid). With this
subtransaction code remains in generic place and AMs intending to use it
continue to leverage the common code, plus explicitly clarifies the
behavior as well.

#8Ashwin Agrawal
aagrawal@pivotal.io
In reply to: Ashwin Agrawal (#7)
1 attachment(s)
Re: Remove HeapTuple and Buffer dependency for predicate locking functions

On Wed, Jul 31, 2019 at 12:37 PM Ashwin Agrawal <aagrawal@pivotal.io> wrote:

On Wed, Jul 31, 2019 at 10:55 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2019-07-31 10:42:50 -0700, Ashwin Agrawal wrote:

On Tue, Jul 30, 2019 at 2:58 PM Thomas Munro <thomas.munro@gmail.com>

wrote:

On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <andres@anarazel.de>

wrote:

- CheckForSerializableConflictOut() no more takes HeapTuple nor
buffer, instead just takes xid. Push heap specific parts from
CheckForSerializableConflictOut() into its own function
HeapCheckForSerializableConflictOut() which calls
CheckForSerializableConflictOut(). The alternative option could

be

CheckForSerializableConflictOut() take callback function and
callback arguments, which gets called if required after

performing

prechecks. Though currently I fell AM having its own wrapper to
perform AM specific task and then calling
CheckForSerializableConflictOut() is fine.

I think it's right to move the xid handling out of
CheckForSerializableConflictOut(). But I think we also ought to

move the

subtransaction handling out of the function - e.g. zheap doesn't
want/need that.

Thoughts on this Ashwin?

I think the only part its doing for sub-transaction is
SubTransGetTopmostTransaction(xid). If xid passed to this function is
already top most transaction which is case for zheap and zedstore, then
there is no downside to keeping that code here in common place.

Well, it's far from a cheap function. It'll do unnecessary on-disk
lookups in many cases. I'd call that quite a downside.

Okay, agree, its costly function and better to avoid the call if possible.

Instead of moving the handling out of the function, how do feel about
adding boolean isTopTransactionId argument to function
CheckForSerializableConflictOut(). The AMs, which implicitly know, only
pass top transaction Id to this function, can pass true and avoid the
function call to SubTransGetTopmostTransaction(xid). With this
subtransaction code remains in generic place and AMs intending to use it
continue to leverage the common code, plus explicitly clarifies the
behavior as well.

Added argument to function to make the subtransaction handling optional in
attached v2 of patch.

Attachments:

v2-0001-Remove-HeapTuple-dependency-for-predicate-locking.patchtext/x-patch; charset=US-ASCII; name=v2-0001-Remove-HeapTuple-dependency-for-predicate-locking.patchDownload
From 4ca67592f34b63cf80247068a128407d800c62a6 Mon Sep 17 00:00:00 2001
From: Ashwin Agrawal <aagrawal@pivotal.io>
Date: Wed, 31 Jul 2019 13:53:52 -0700
Subject: [PATCH v2] Remove HeapTuple dependency for predicate locking
 functions.

Following changes help to make predicate locking functions generic and
not tied to particular AM.

- PredicateLockTuple() is renamed to PredicateLockTID(). It takes
  ItemPointer and inserting transaction id now instead of HeapTuple.

- CheckForSerializableConflictIn() takes blocknum instead of buffer

- CheckForSerializableConflictOut() no more takes HeapTuple nor buffer
---
 src/backend/access/gin/ginbtree.c        |   2 +-
 src/backend/access/gin/ginfast.c         |   2 +-
 src/backend/access/gin/gininsert.c       |   4 +-
 src/backend/access/gist/gist.c           |   2 +-
 src/backend/access/hash/hashinsert.c     |   2 +-
 src/backend/access/heap/heapam.c         | 104 ++++++++++++++++--
 src/backend/access/heap/heapam_handler.c |   7 +-
 src/backend/access/index/indexam.c       |   4 +-
 src/backend/access/nbtree/nbtinsert.c    |   4 +-
 src/backend/storage/lmgr/predicate.c     | 134 ++++++++---------------
 src/include/access/heapam.h              |   2 +
 src/include/storage/predicate.h          |  10 +-
 12 files changed, 162 insertions(+), 115 deletions(-)

diff --git a/src/backend/access/gin/ginbtree.c b/src/backend/access/gin/ginbtree.c
index 11a8ed7bbc2..e795375495b 100644
--- a/src/backend/access/gin/ginbtree.c
+++ b/src/backend/access/gin/ginbtree.c
@@ -89,7 +89,7 @@ ginFindLeafPage(GinBtree btree, bool searchMode,
 	stack->predictNumber = 1;
 
 	if (rootConflictCheck)
-		CheckForSerializableConflictIn(btree->index, NULL, stack->buffer);
+		CheckForSerializableConflictIn(btree->index, NULL, btree->rootBlkno);
 
 	for (;;)
 	{
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index 439a91b3e61..d7b52476817 100644
--- a/src/backend/access/gin/ginfast.c
+++ b/src/backend/access/gin/ginfast.c
@@ -246,7 +246,7 @@ ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector)
 	 * tree, so it conflicts with all serializable scans.  All scans acquire a
 	 * predicate lock on the metabuffer to represent that.
 	 */
-	CheckForSerializableConflictIn(index, NULL, metabuffer);
+	CheckForSerializableConflictIn(index, NULL, GIN_METAPAGE_BLKNO);
 
 	if (collector->sumsize + collector->ntuples * sizeof(ItemIdData) > GinListPageSize)
 	{
diff --git a/src/backend/access/gin/gininsert.c b/src/backend/access/gin/gininsert.c
index 55eab146173..046a20a3d41 100644
--- a/src/backend/access/gin/gininsert.c
+++ b/src/backend/access/gin/gininsert.c
@@ -221,7 +221,7 @@ ginEntryInsert(GinState *ginstate,
 			return;
 		}
 
-		CheckForSerializableConflictIn(ginstate->index, NULL, stack->buffer);
+		CheckForSerializableConflictIn(ginstate->index, NULL, BufferGetBlockNumber(stack->buffer));
 		/* modify an existing leaf entry */
 		itup = addItemPointersToLeafTuple(ginstate, itup,
 										  items, nitem, buildStats, stack->buffer);
@@ -230,7 +230,7 @@ ginEntryInsert(GinState *ginstate,
 	}
 	else
 	{
-		CheckForSerializableConflictIn(ginstate->index, NULL, stack->buffer);
+		CheckForSerializableConflictIn(ginstate->index, NULL, BufferGetBlockNumber(stack->buffer));
 		/* no match, so construct a new leaf entry */
 		itup = buildFreshLeafTuple(ginstate, attnum, key, category,
 								   items, nitem, buildStats, stack->buffer);
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index e9ca4b82527..4db54a42761 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -1263,7 +1263,7 @@ gistinserttuples(GISTInsertState *state, GISTInsertStack *stack,
 	 * Check for any rw conflicts (in serializable isolation level) just
 	 * before we intend to modify the page
 	 */
-	CheckForSerializableConflictIn(state->r, NULL, stack->buffer);
+	CheckForSerializableConflictIn(state->r, NULL, BufferGetBlockNumber(stack->buffer));
 
 	/* Insert the tuple(s) to the page, splitting the page if necessary */
 	is_split = gistplacetopage(state->r, state->freespace, giststate,
diff --git a/src/backend/access/hash/hashinsert.c b/src/backend/access/hash/hashinsert.c
index 5321762d5ea..e3fb47f9e3d 100644
--- a/src/backend/access/hash/hashinsert.c
+++ b/src/backend/access/hash/hashinsert.c
@@ -88,7 +88,7 @@ restart_insert:
 										  &usedmetap);
 	Assert(usedmetap != NULL);
 
-	CheckForSerializableConflictIn(rel, NULL, buf);
+	CheckForSerializableConflictIn(rel, NULL, BufferGetBlockNumber(buf));
 
 	/* remember the primary bucket buffer to release the pin on it at end. */
 	bucket_buf = buf;
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 94309949fac..8475c4b0168 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -40,6 +40,7 @@
 #include "access/multixact.h"
 #include "access/parallel.h"
 #include "access/relscan.h"
+#include "access/subtrans.h"
 #include "access/sysattr.h"
 #include "access/tableam.h"
 #include "access/transam.h"
@@ -446,7 +447,7 @@ heapgetpage(TableScanDesc sscan, BlockNumber page)
 			else
 				valid = HeapTupleSatisfiesVisibility(&loctup, snapshot, buffer);
 
-			CheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
+			HeapCheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
 											&loctup, buffer, snapshot);
 
 			if (valid)
@@ -668,7 +669,7 @@ heapgettup(HeapScanDesc scan,
 													 snapshot,
 													 scan->rs_cbuf);
 
-				CheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
+				HeapCheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
 												tuple, scan->rs_cbuf,
 												snapshot);
 
@@ -1477,9 +1478,10 @@ heap_fetch(Relation relation,
 	valid = HeapTupleSatisfiesVisibility(tuple, snapshot, buffer);
 
 	if (valid)
-		PredicateLockTuple(relation, tuple, snapshot);
+		PredicateLockTID(relation, &(tuple->t_self), snapshot,
+						 HeapTupleHeaderGetXmin(tuple->t_data));
 
-	CheckForSerializableConflictOut(valid, relation, tuple, buffer, snapshot);
+	HeapCheckForSerializableConflictOut(valid, relation, tuple, buffer, snapshot);
 
 	LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
 
@@ -1613,7 +1615,7 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
 
 			/* If it's visible per the snapshot, we must return it */
 			valid = HeapTupleSatisfiesVisibility(heapTuple, snapshot, buffer);
-			CheckForSerializableConflictOut(valid, relation, heapTuple,
+			HeapCheckForSerializableConflictOut(valid, relation, heapTuple,
 											buffer, snapshot);
 			/* reset to original, non-redirected, tid */
 			heapTuple->t_self = *tid;
@@ -1621,7 +1623,8 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
 			if (valid)
 			{
 				ItemPointerSetOffsetNumber(tid, offnum);
-				PredicateLockTuple(relation, heapTuple, snapshot);
+				PredicateLockTID(relation, &(heapTuple)->t_self, snapshot,
+								 HeapTupleHeaderGetXmin(heapTuple->t_data));
 				if (all_dead)
 					*all_dead = false;
 				return true;
@@ -1755,7 +1758,7 @@ heap_get_latest_tid(TableScanDesc sscan,
 		 * candidate.
 		 */
 		valid = HeapTupleSatisfiesVisibility(&tp, snapshot, buffer);
-		CheckForSerializableConflictOut(valid, relation, &tp, buffer, snapshot);
+		HeapCheckForSerializableConflictOut(valid, relation, &tp, buffer, snapshot);
 		if (valid)
 			*tid = ctid;
 
@@ -1910,7 +1913,7 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
 	 * lock "gaps" as index page locks do.  So we don't need to specify a
 	 * buffer when making the call, which makes for a faster check.
 	 */
-	CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);
+	CheckForSerializableConflictIn(relation, NULL, InvalidBlockNumber);
 
 	/* NO EREPORT(ERROR) from here till changes are logged */
 	START_CRIT_SECTION();
@@ -2164,7 +2167,7 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
 	 * lock "gaps" as index page locks do.  So we don't need to specify a
 	 * buffer when making the call, which makes for a faster check.
 	 */
-	CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);
+	CheckForSerializableConflictIn(relation, NULL, InvalidBlockNumber);
 
 	ndone = 0;
 	while (ndone < ntuples)
@@ -2355,7 +2358,7 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
 	 * lock "gaps" as index page locks do.  So we don't need to specify a
 	 * buffer when making the call.
 	 */
-	CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);
+	CheckForSerializableConflictIn(relation, NULL, InvalidBlockNumber);
 
 	/*
 	 * If tuples are cachable, mark them for invalidation from the caches in
@@ -2669,7 +2672,7 @@ l1:
 	 * being visible to the scan (i.e., an exclusive buffer content lock is
 	 * continuously held from this point until the tuple delete is visible).
 	 */
-	CheckForSerializableConflictIn(relation, &tp, buffer);
+	CheckForSerializableConflictIn(relation, tid, BufferGetBlockNumber(buffer));
 
 	/* replace cid with a combo cid if necessary */
 	HeapTupleHeaderAdjustCmax(tp.t_data, &cid, &iscombo);
@@ -3584,7 +3587,7 @@ l2:
 	 * will include checking the relation level, there is no benefit to a
 	 * separate check for the new tuple.
 	 */
-	CheckForSerializableConflictIn(relation, &oldtup, buffer);
+	CheckForSerializableConflictIn(relation, otid, BufferGetBlockNumber(buffer));
 
 	/*
 	 * At this point newbuf and buffer are both pinned and locked, and newbuf
@@ -9044,3 +9047,80 @@ heap_mask(char *pagedata, BlockNumber blkno)
 		}
 	}
 }
+
+/*
+ * HeapCheckForSerializableConflictOut
+ *		We are reading a tuple which has been modified.  If it is visible to
+ *		us but has been deleted, that indicates a rw-conflict out.  If it's
+ *		not visible and was created by a concurrent (overlapping)
+ *		serializable transaction, that is also a rw-conflict out,
+ *
+ * We will determine the top level xid of the writing transaction with which
+ * we may be in conflict, and check for overlap with our own transaction.
+ * If the transactions overlap (i.e., they cannot see each other's writes),
+ * then we have a conflict out.
+ *
+ * This function should be called just about anywhere in heapam.c where a
+ * tuple has been read. The caller must hold at least a shared lock on the
+ * buffer, because this function might set hint bits on the tuple. There is
+ * currently no known reason to call this function from an index AM.
+ */
+void
+HeapCheckForSerializableConflictOut(bool visible, Relation relation,
+									HeapTuple tuple, Buffer buffer,
+									Snapshot snapshot)
+{
+	TransactionId xid;
+	HTSV_Result htsvResult;
+
+	if (!CheckForSerializableConflictOutNeeded(relation, snapshot))
+		return;
+
+	/*
+	 * Check to see whether the tuple has been written to by a concurrent
+	 * transaction, either to create it not visible to us, or to delete it
+	 * while it is visible to us.  The "visible" bool indicates whether the
+	 * tuple is visible to us, while HeapTupleSatisfiesVacuum checks what else
+	 * is going on with it.
+	 */
+	htsvResult = HeapTupleSatisfiesVacuum(tuple, TransactionXmin, buffer);
+	switch (htsvResult)
+	{
+		case HEAPTUPLE_LIVE:
+			if (visible)
+				return;
+			xid = HeapTupleHeaderGetXmin(tuple->t_data);
+			break;
+		case HEAPTUPLE_RECENTLY_DEAD:
+			if (!visible)
+				return;
+			xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
+			break;
+		case HEAPTUPLE_DELETE_IN_PROGRESS:
+			xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
+			break;
+		case HEAPTUPLE_INSERT_IN_PROGRESS:
+			xid = HeapTupleHeaderGetXmin(tuple->t_data);
+			break;
+		case HEAPTUPLE_DEAD:
+			return;
+		default:
+
+			/*
+			 * The only way to get to this default clause is if a new value is
+			 * added to the enum type without adding it to this switch
+			 * statement.  That's a bug, so elog.
+			 */
+			elog(ERROR, "unrecognized return value from HeapTupleSatisfiesVacuum: %u", htsvResult);
+
+			/*
+			 * In spite of having all enum values covered and calling elog on
+			 * this default, some compilers think this is a code path which
+			 * allows xid to be used below without initialization. Silence
+			 * that warning.
+			 */
+			xid = InvalidTransactionId;
+	}
+
+	return CheckForSerializableConflictOut(relation, xid, true, snapshot);
+}
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 09bc6fe98a7..bad26bb934b 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -2164,9 +2164,10 @@ heapam_scan_bitmap_next_block(TableScanDesc scan,
 			if (valid)
 			{
 				hscan->rs_vistuples[ntup++] = offnum;
-				PredicateLockTuple(scan->rs_rd, &loctup, snapshot);
+				PredicateLockTID(scan->rs_rd, &loctup.t_self, snapshot,
+								 HeapTupleHeaderGetXmin(loctup.t_data));
 			}
-			CheckForSerializableConflictOut(valid, scan->rs_rd, &loctup,
+			HeapCheckForSerializableConflictOut(valid, scan->rs_rd, &loctup,
 											buffer, snapshot);
 		}
 	}
@@ -2354,7 +2355,7 @@ heapam_scan_sample_next_tuple(TableScanDesc scan, SampleScanState *scanstate,
 
 			/* in pagemode, heapgetpage did this for us */
 			if (!pagemode)
-				CheckForSerializableConflictOut(visible, scan->rs_rd, tuple,
+				HeapCheckForSerializableConflictOut(visible, scan->rs_rd, tuple,
 												hscan->rs_cbuf, scan->rs_snapshot);
 
 			/* Try next tuple from same page. */
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index 28edd4aca76..6f1093ae383 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -180,8 +180,8 @@ index_insert(Relation indexRelation,
 
 	if (!(indexRelation->rd_indam->ampredlocks))
 		CheckForSerializableConflictIn(indexRelation,
-									   (HeapTuple) NULL,
-									   InvalidBuffer);
+									   (ItemPointer) NULL,
+									   InvalidBlockNumber);
 
 	return indexRelation->rd_indam->aminsert(indexRelation, values, isnull,
 											 heap_t_ctid, heapRelation,
diff --git a/src/backend/access/nbtree/nbtinsert.c b/src/backend/access/nbtree/nbtinsert.c
index 602f8849d4a..22fcb26a069 100644
--- a/src/backend/access/nbtree/nbtinsert.c
+++ b/src/backend/access/nbtree/nbtinsert.c
@@ -290,7 +290,7 @@ top:
 		 * checkingunique and !heapkeyspace cases, but it's okay to use the
 		 * first page the value could be on (with scantid omitted) instead.
 		 */
-		CheckForSerializableConflictIn(rel, NULL, insertstate.buf);
+		CheckForSerializableConflictIn(rel, NULL, BufferGetBlockNumber(insertstate.buf));
 
 		/*
 		 * Do the insertion.  Note that insertstate contains cached binary
@@ -533,7 +533,7 @@ _bt_check_unique(Relation rel, BTInsertState insertstate, Relation heapRel,
 					 * otherwise be masked by this unique constraint
 					 * violation.
 					 */
-					CheckForSerializableConflictIn(rel, NULL, insertstate->buf);
+					CheckForSerializableConflictIn(rel, NULL, BufferGetBlockNumber(insertstate->buf));
 
 					/*
 					 * This is a definite conflict.  Break the tuple down into
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 2d709420c3d..b4e5b893e4f 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -163,8 +163,8 @@
  *		PredicateLockRelation(Relation relation, Snapshot snapshot)
  *		PredicateLockPage(Relation relation, BlockNumber blkno,
  *						Snapshot snapshot)
- *		PredicateLockTuple(Relation relation, HeapTuple tuple,
- *						Snapshot snapshot)
+ *		PredicateLockTID(Relation relation, ItemPointer tid, Snapshot snapshot,
+ *						 TransactionId insert_xid)
  *		PredicateLockPageSplit(Relation relation, BlockNumber oldblkno,
  *							   BlockNumber newblkno)
  *		PredicateLockPageCombine(Relation relation, BlockNumber oldblkno,
@@ -173,11 +173,10 @@
  *		ReleasePredicateLocks(bool isCommit, bool isReadOnlySafe)
  *
  * conflict detection (may also trigger rollback)
- *		CheckForSerializableConflictOut(bool visible, Relation relation,
- *										HeapTupleData *tup, Buffer buffer,
+ *		CheckForSerializableConflictOut(Relation relation, TransactionId xid,
  *										Snapshot snapshot)
- *		CheckForSerializableConflictIn(Relation relation, HeapTupleData *tup,
- *									   Buffer buffer)
+ *		CheckForSerializableConflictIn(Relation relation, ItemPointer tid,
+ *									   BlockNumber blkno)
  *		CheckTableForSerializableConflictIn(Relation relation)
  *
  * final rollback checking
@@ -193,8 +192,6 @@
 
 #include "postgres.h"
 
-#include "access/heapam.h"
-#include "access/htup_details.h"
 #include "access/parallel.h"
 #include "access/slru.h"
 #include "access/subtrans.h"
@@ -2538,37 +2535,34 @@ PredicateLockPage(Relation relation, BlockNumber blkno, Snapshot snapshot)
 }
 
 /*
- *		PredicateLockTuple
+ *		PredicateLockTID
  *
  * Gets a predicate lock at the tuple level.
  * Skip if not in full serializable transaction isolation level.
  * Skip if this is a temporary table.
  */
 void
-PredicateLockTuple(Relation relation, HeapTuple tuple, Snapshot snapshot)
+PredicateLockTID(Relation relation, ItemPointer tid, Snapshot snapshot,
+				 TransactionId insert_xid)
 {
 	PREDICATELOCKTARGETTAG tag;
-	ItemPointer tid;
-	TransactionId targetxmin;
 
 	if (!SerializationNeededForRead(relation, snapshot))
 		return;
 
 	/*
-	 * If it's a heap tuple, return if this xact wrote it.
+	 * Return if this xact wrote it.
 	 */
 	if (relation->rd_index == NULL)
 	{
 		TransactionId myxid;
 
-		targetxmin = HeapTupleHeaderGetXmin(tuple->t_data);
-
 		myxid = GetTopTransactionIdIfAny();
 		if (TransactionIdIsValid(myxid))
 		{
-			if (TransactionIdFollowsOrEquals(targetxmin, TransactionXmin))
+			if (TransactionIdFollowsOrEquals(insert_xid, TransactionXmin))
 			{
-				TransactionId xid = SubTransGetTopmostTransaction(targetxmin);
+				TransactionId xid = SubTransGetTopmostTransaction(insert_xid);
 
 				if (TransactionIdEquals(xid, myxid))
 				{
@@ -2591,7 +2585,6 @@ PredicateLockTuple(Relation relation, HeapTuple tuple, Snapshot snapshot)
 	if (PredicateLockExists(&tag))
 		return;
 
-	tid = &(tuple->t_self);
 	SET_PREDICATELOCKTARGETTAG_TUPLE(tag,
 									 relation->rd_node.dbNode,
 									 relation->rd_id,
@@ -4036,33 +4029,44 @@ XidIsConcurrent(TransactionId xid)
 	return false;
 }
 
+bool
+CheckForSerializableConflictOutNeeded(Relation relation, Snapshot snapshot)
+{
+	if (!SerializationNeededForRead(relation, snapshot))
+		return false;
+
+	/* Check if someone else has already decided that we need to die */
+	if (SxactIsDoomed(MySerializableXact))
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+				 errmsg("could not serialize access due to read/write dependencies among transactions"),
+				 errdetail_internal("Reason code: Canceled on identification as a pivot, during conflict out checking."),
+				 errhint("The transaction might succeed if retried.")));
+	}
+
+	return true;
+}
+
 /*
  * CheckForSerializableConflictOut
  *		We are reading a tuple which has been modified.  If it is visible to
  *		us but has been deleted, that indicates a rw-conflict out.  If it's
  *		not visible and was created by a concurrent (overlapping)
- *		serializable transaction, that is also a rw-conflict out,
+ *		serializable transaction, that is also a rw-conflict out.
  *
  * We will determine the top level xid of the writing transaction with which
  * we may be in conflict, and check for overlap with our own transaction.
  * If the transactions overlap (i.e., they cannot see each other's writes),
  * then we have a conflict out.
- *
- * This function should be called just about anywhere in heapam.c where a
- * tuple has been read. The caller must hold at least a shared lock on the
- * buffer, because this function might set hint bits on the tuple. There is
- * currently no known reason to call this function from an index AM.
  */
 void
-CheckForSerializableConflictOut(bool visible, Relation relation,
-								HeapTuple tuple, Buffer buffer,
-								Snapshot snapshot)
+CheckForSerializableConflictOut(Relation relation, TransactionId xid,
+								bool needSubtransHandling, Snapshot snapshot)
 {
-	TransactionId xid;
 	SERIALIZABLEXIDTAG sxidtag;
 	SERIALIZABLEXID *sxid;
 	SERIALIZABLEXACT *sxact;
-	HTSV_Result htsvResult;
 
 	if (!SerializationNeededForRead(relation, snapshot))
 		return;
@@ -4077,51 +4081,6 @@ CheckForSerializableConflictOut(bool visible, Relation relation,
 				 errhint("The transaction might succeed if retried.")));
 	}
 
-	/*
-	 * Check to see whether the tuple has been written to by a concurrent
-	 * transaction, either to create it not visible to us, or to delete it
-	 * while it is visible to us.  The "visible" bool indicates whether the
-	 * tuple is visible to us, while HeapTupleSatisfiesVacuum checks what else
-	 * is going on with it.
-	 */
-	htsvResult = HeapTupleSatisfiesVacuum(tuple, TransactionXmin, buffer);
-	switch (htsvResult)
-	{
-		case HEAPTUPLE_LIVE:
-			if (visible)
-				return;
-			xid = HeapTupleHeaderGetXmin(tuple->t_data);
-			break;
-		case HEAPTUPLE_RECENTLY_DEAD:
-			if (!visible)
-				return;
-			xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
-			break;
-		case HEAPTUPLE_DELETE_IN_PROGRESS:
-			xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
-			break;
-		case HEAPTUPLE_INSERT_IN_PROGRESS:
-			xid = HeapTupleHeaderGetXmin(tuple->t_data);
-			break;
-		case HEAPTUPLE_DEAD:
-			return;
-		default:
-
-			/*
-			 * The only way to get to this default clause is if a new value is
-			 * added to the enum type without adding it to this switch
-			 * statement.  That's a bug, so elog.
-			 */
-			elog(ERROR, "unrecognized return value from HeapTupleSatisfiesVacuum: %u", htsvResult);
-
-			/*
-			 * In spite of having all enum values covered and calling elog on
-			 * this default, some compilers think this is a code path which
-			 * allows xid to be used below without initialization. Silence
-			 * that warning.
-			 */
-			xid = InvalidTransactionId;
-	}
 	Assert(TransactionIdIsValid(xid));
 	Assert(TransactionIdFollowsOrEquals(xid, TransactionXmin));
 
@@ -4131,11 +4090,15 @@ CheckForSerializableConflictOut(bool visible, Relation relation,
 	 */
 	if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
 		return;
-	xid = SubTransGetTopmostTransaction(xid);
-	if (TransactionIdPrecedes(xid, TransactionXmin))
-		return;
-	if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
-		return;
+
+	if (needSubtransHandling)
+	{
+		xid = SubTransGetTopmostTransaction(xid);
+		if (TransactionIdPrecedes(xid, TransactionXmin))
+			return;
+		if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
+			return;
+	}
 
 	/*
 	 * Find sxact or summarized info for the top level xid.
@@ -4439,8 +4402,7 @@ CheckTargetForConflictsIn(PREDICATELOCKTARGETTAG *targettag)
  * tuple itself.
  */
 void
-CheckForSerializableConflictIn(Relation relation, HeapTuple tuple,
-							   Buffer buffer)
+CheckForSerializableConflictIn(Relation relation, ItemPointer tid, BlockNumber blkno)
 {
 	PREDICATELOCKTARGETTAG targettag;
 
@@ -4470,22 +4432,22 @@ CheckForSerializableConflictIn(Relation relation, HeapTuple tuple,
 	 * It is not possible to take and hold a lock across the checks for all
 	 * granularities because each target could be in a separate partition.
 	 */
-	if (tuple != NULL)
+	if (tid != NULL)
 	{
 		SET_PREDICATELOCKTARGETTAG_TUPLE(targettag,
 										 relation->rd_node.dbNode,
 										 relation->rd_id,
-										 ItemPointerGetBlockNumber(&(tuple->t_self)),
-										 ItemPointerGetOffsetNumber(&(tuple->t_self)));
+										 ItemPointerGetBlockNumber(tid),
+										 ItemPointerGetOffsetNumber(tid));
 		CheckTargetForConflictsIn(&targettag);
 	}
 
-	if (BufferIsValid(buffer))
+	if (blkno != InvalidBlockNumber)
 	{
 		SET_PREDICATELOCKTARGETTAG_PAGE(targettag,
 										relation->rd_node.dbNode,
 										relation->rd_id,
-										BufferGetBlockNumber(buffer));
+										blkno);
 		CheckTargetForConflictsIn(&targettag);
 	}
 
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6bc96..390023c0a55 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -217,5 +217,7 @@ extern bool ResolveCminCmaxDuringDecoding(struct HTAB *tuplecid_data,
 										  HeapTuple htup,
 										  Buffer buffer,
 										  CommandId *cmin, CommandId *cmax);
+extern void HeapCheckForSerializableConflictOut(bool valid, Relation relation, HeapTuple tuple,
+												Buffer buffer, Snapshot snapshot);
 
 #endif							/* HEAPAM_H */
diff --git a/src/include/storage/predicate.h b/src/include/storage/predicate.h
index 376245ecd70..f5fe16e1815 100644
--- a/src/include/storage/predicate.h
+++ b/src/include/storage/predicate.h
@@ -57,16 +57,18 @@ extern void SetSerializableTransactionSnapshot(Snapshot snapshot,
 extern void RegisterPredicateLockingXid(TransactionId xid);
 extern void PredicateLockRelation(Relation relation, Snapshot snapshot);
 extern void PredicateLockPage(Relation relation, BlockNumber blkno, Snapshot snapshot);
-extern void PredicateLockTuple(Relation relation, HeapTuple tuple, Snapshot snapshot);
+extern void PredicateLockTID(Relation relation, ItemPointer tid, Snapshot snapshot,
+							 TransactionId insert_xid);
 extern void PredicateLockPageSplit(Relation relation, BlockNumber oldblkno, BlockNumber newblkno);
 extern void PredicateLockPageCombine(Relation relation, BlockNumber oldblkno, BlockNumber newblkno);
 extern void TransferPredicateLocksToHeapRelation(Relation relation);
 extern void ReleasePredicateLocks(bool isCommit, bool isReadOnlySafe);
 
 /* conflict detection (may also trigger rollback) */
-extern void CheckForSerializableConflictOut(bool valid, Relation relation, HeapTuple tuple,
-											Buffer buffer, Snapshot snapshot);
-extern void CheckForSerializableConflictIn(Relation relation, HeapTuple tuple, Buffer buffer);
+extern bool CheckForSerializableConflictOutNeeded(Relation relation, Snapshot snapshot);
+extern void CheckForSerializableConflictOut(Relation relation, TransactionId xid,
+											bool needSubtransHandling, Snapshot snapshot);
+extern void CheckForSerializableConflictIn(Relation relation, ItemPointer tid, BlockNumber blkno);
 extern void CheckTableForSerializableConflictIn(Relation relation);
 
 /* final rollback checking */
-- 
2.19.1

#9Andres Freund
andres@anarazel.de
In reply to: Ashwin Agrawal (#7)
Re: Remove HeapTuple and Buffer dependency for predicate locking functions

Hi,

On 2019-07-31 12:37:58 -0700, Ashwin Agrawal wrote:

On Wed, Jul 31, 2019 at 10:55 AM Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2019-07-31 10:42:50 -0700, Ashwin Agrawal wrote:

On Tue, Jul 30, 2019 at 2:58 PM Thomas Munro <thomas.munro@gmail.com>

wrote:

On Tue, Jun 25, 2019 at 6:02 AM Andres Freund <andres@anarazel.de>

wrote:

- CheckForSerializableConflictOut() no more takes HeapTuple nor
buffer, instead just takes xid. Push heap specific parts from
CheckForSerializableConflictOut() into its own function
HeapCheckForSerializableConflictOut() which calls
CheckForSerializableConflictOut(). The alternative option could

be

CheckForSerializableConflictOut() take callback function and
callback arguments, which gets called if required after

performing

prechecks. Though currently I fell AM having its own wrapper to
perform AM specific task and then calling
CheckForSerializableConflictOut() is fine.

I think it's right to move the xid handling out of
CheckForSerializableConflictOut(). But I think we also ought to move

the

subtransaction handling out of the function - e.g. zheap doesn't
want/need that.

Thoughts on this Ashwin?

I think the only part its doing for sub-transaction is
SubTransGetTopmostTransaction(xid). If xid passed to this function is
already top most transaction which is case for zheap and zedstore, then
there is no downside to keeping that code here in common place.

Well, it's far from a cheap function. It'll do unnecessary on-disk
lookups in many cases. I'd call that quite a downside.

Okay, agree, its costly function and better to avoid the call if possible.

Instead of moving the handling out of the function, how do feel about
adding boolean isTopTransactionId argument to function
CheckForSerializableConflictOut(). The AMs, which implicitly know, only
pass top transaction Id to this function, can pass true and avoid the
function call to SubTransGetTopmostTransaction(xid). With this
subtransaction code remains in generic place and AMs intending to use it
continue to leverage the common code, plus explicitly clarifies the
behavior as well.

Looking at the code as of master, we currently have:

- PredicateLockTuple() calls SubTransGetTopmostTransaction() to figure
out a whether the tuple has been locked by the current
transaction. That check afaict just should be
TransactionIdIsCurrentTransactionId(), without all the other
stuff that's done today.

TransactionIdIsCurrentTransactionId() imo ought to be optimized to
always check for the top level transactionid first - that's a good bet
today, but even moreso for the upcoming AMs that won't have separate
xids for subtransactions. Alternatively we shouldn't make that a
binary search for each subtrans level, but just have a small
simplehash hashtable for xids.

- CheckForSerializableConflictOut() wants to get the toplevel xid for
the tuple, because that's the one the predicate hashtable stores.

In your patch you've already moved the HTSV() call etc out of
CheckForSerializableConflictOut(). I'm somewhat inclined to think that
the SubTransGetTopmostTransaction() call ought to go along with that.
I don't really think that belongs in predicate.c, especially if
most/all new AMs don't use subtransaction ids.

The only downside is that currently the
TransactionIdEquals(xid, GetTopTransactionIdIfAny()) check
avoids the SubTransGetTopmostTransaction() check.

But again, the better fix for that seems to be to improve the generic
code. As written the check won't prevent a subtrans lookup for heap
when subtransactions are in use, and it's IME pretty common for tuples
to get looked at again in the transaction that has created them. So
I'm somewhat inclined to think that SubTransGetTopmostTransaction()
should have a fast-path for the current transaction - probably just
employing TransactionIdIsCurrentTransactionId().

I don't really see what we gain by having the subtrans handling in the
predicate code. Especially given that we've already moved the HTSV()
handling out, it seems architecturally the wrong place to me - but I
admit that that's a fuzzy argument. The relevant mapping should be one
line in the caller.

I wonder if it'd be wroth to combine the
TransactionIdIsCurrentTransactionId() calls in the heap cases that
currently do both, PredicateLockTuple() and
HeapCheckForSerializableConflictOut(). The heap_fetch() case probably
isn't commonly that hot a pathq, but heap_hot_search_buffer() is.

Minor notes:
- I don't think 'insert_xid' is necessarily great - it could also be the
  updating xid etc. And while you can argue that an update is an insert
  in the current heap, that's not the case for future AMs.
- to me
@@ -1621,7 +1622,8 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
 			if (valid)
 			{
 				ItemPointerSetOffsetNumber(tid, offnum);
-				PredicateLockTuple(relation, heapTuple, snapshot);
+				PredicateLockTID(relation, &(heapTuple)->t_self, snapshot,
+								 HeapTupleHeaderGetXmin(heapTuple->t_data));
 				if (all_dead)
 					*all_dead = false;
 				return true;

What are those parens - as placed they can't do anything. Did you
intend to write &(heapTuple->t_self)? Even that is pretty superfluous,
but it at least clarifies the precedence.

I'm also a bit confused why we don't need to pass in the offset of the
current tuple, rather than the HOT root tuple here. That's not related
to this patch. But aren't we locking the wrong tuple here, in case of
HOT?

- I wonder if CheckForSerializableConflictOutNeeded() shouldn't have a
portion of it's code as a static inline. In particular, it's a shame
that we currently perform external function calls at quite the
frequency when serializable isn't even in use.

Greetings,

Andres Freund

#10Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Andres Freund (#9)
Re: Remove HeapTuple and Buffer dependency for predicate locking functions

On Thu, Aug 1, 2019 at 2:36 AM Andres Freund <andres@anarazel.de> wrote:

I think the only part its doing for sub-transaction is
SubTransGetTopmostTransaction(xid). If xid passed to this function is
already top most transaction which is case for zheap and zedstore, then
there is no downside to keeping that code here in common place.

Well, it's far from a cheap function. It'll do unnecessary on-disk
lookups in many cases. I'd call that quite a downside.

Okay, agree, its costly function and better to avoid the call if possible.

Instead of moving the handling out of the function, how do feel about
adding boolean isTopTransactionId argument to function
CheckForSerializableConflictOut(). The AMs, which implicitly know, only
pass top transaction Id to this function, can pass true and avoid the
function call to SubTransGetTopmostTransaction(xid). With this
subtransaction code remains in generic place and AMs intending to use it
continue to leverage the common code, plus explicitly clarifies the
behavior as well.

Looking at the code as of master, we currently have:

- PredicateLockTuple() calls SubTransGetTopmostTransaction() to figure
out a whether the tuple has been locked by the current
transaction. That check afaict just should be
TransactionIdIsCurrentTransactionId(), without all the other
stuff that's done today.

Yeah. this is the only part where predicate locking uses the subxids.
Since, predicate locking always use the top xid, IMHO, it'll be good
to make this api independent of subxids.

TransactionIdIsCurrentTransactionId() imo ought to be optimized to
always check for the top level transactionid first - that's a good bet
today, but even moreso for the upcoming AMs that won't have separate
xids for subtransactions. Alternatively we shouldn't make that a
binary search for each subtrans level, but just have a small
simplehash hashtable for xids.

A check for top transaction id first and usage of simple sound like
good optimizations. But, I'm not sure whether these changes should be
part of this patch or a separate one.

--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com

#11Ashwin Agrawal
aagrawal@pivotal.io
In reply to: Andres Freund (#9)
3 attachment(s)
Re: Remove HeapTuple and Buffer dependency for predicate locking functions

On Wed, Jul 31, 2019 at 2:06 PM Andres Freund <andres@anarazel.de> wrote:

Looking at the code as of master, we currently have:

Super awesome feedback and insights, thank you!

- PredicateLockTuple() calls SubTransGetTopmostTransaction() to figure

out a whether the tuple has been locked by the current
transaction. That check afaict just should be
TransactionIdIsCurrentTransactionId(), without all the other
stuff that's done today.

Agree. v1-0002 patch attached does that now. Please let me know if that's
what you meant.

TransactionIdIsCurrentTransactionId() imo ought to be optimized to

always check for the top level transactionid first - that's a good bet
today, but even moreso for the upcoming AMs that won't have separate
xids for subtransactions. Alternatively we shouldn't make that a
binary search for each subtrans level, but just have a small
simplehash hashtable for xids.

v1-0001 patch checks for GetTopTransactionIdIfAny() first in
TransactionIdIsCurrentTransactionId() which seems yes better in general and
more for future. That mostly meets the needs for current discussion.

The alternative of not using binary search seems bigger refactoring and
should be handled as separate optimization exercise outside of this thread.

- CheckForSerializableConflictOut() wants to get the toplevel xid for
the tuple, because that's the one the predicate hashtable stores.

In your patch you've already moved the HTSV() call etc out of
CheckForSerializableConflictOut(). I'm somewhat inclined to think that
the SubTransGetTopmostTransaction() call ought to go along with that.
I don't really think that belongs in predicate.c, especially if
most/all new AMs don't use subtransaction ids.

The only downside is that currently the
TransactionIdEquals(xid, GetTopTransactionIdIfAny()) check
avoids the SubTransGetTopmostTransaction() check.

But again, the better fix for that seems to be to improve the generic
code. As written the check won't prevent a subtrans lookup for heap
when subtransactions are in use, and it's IME pretty common for tuples
to get looked at again in the transaction that has created them. So
I'm somewhat inclined to think that SubTransGetTopmostTransaction()
should have a fast-path for the current transaction - probably just
employing TransactionIdIsCurrentTransactionId().

That optimization, as Kuntal also mentioned, seems something which can be
done on-top afterwards on current patch.

I don't really see what we gain by having the subtrans handling in the
predicate code. Especially given that we've already moved the HTSV()
handling out, it seems architecturally the wrong place to me - but I
admit that that's a fuzzy argument. The relevant mapping should be one
line in the caller.

Okay, I moved the sub transaction handling out of
CheckForSerializableConflictOut() and have it along side HTSV() now.

The reason I felt leaving subtransaction handling in generic place, was it
might be premature to thing no future AM will need it. Plus, all
serializable function api's having same expectations is easier. Like
PredicateLockTuple() can be passed top or subtransaction id and it can
handle it but with the change CheckForSerializableConflictOut() only be
feed top transaction ID. But its fine and can see the point of AM needing
it can easily get top transaction ID and feed it as heap.

I wonder if it'd be wroth to combine the
TransactionIdIsCurrentTransactionId() calls in the heap cases that
currently do both, PredicateLockTuple() and
HeapCheckForSerializableConflictOut(). The heap_fetch() case probably
isn't commonly that hot a pathq, but heap_hot_search_buffer() is.

Maybe, will give thought to it separate from the current patch.

Minor notes:
- I don't think 'insert_xid' is necessarily great - it could also be the
updating xid etc. And while you can argue that an update is an insert
in the current heap, that's not the case for future AMs.
- to me
@@ -1621,7 +1622,8 @@ heap_hot_search_buffer(ItemPointer tid, Relation
relation, Buffer buffer,
if (valid)
{
ItemPointerSetOffsetNumber(tid, offnum);
-                               PredicateLockTuple(relation, heapTuple,
snapshot);
+                               PredicateLockTID(relation,
&(heapTuple)->t_self, snapshot,
+
HeapTupleHeaderGetXmin(heapTuple->t_data));
if (all_dead)
*all_dead = false;
return true;

What are those parens - as placed they can't do anything. Did you
intend to write &(heapTuple->t_self)? Even that is pretty superfluous,
but it at least clarifies the precedence.

Fixed. No idea what I was thinking there, mostly feel I intended to have it
as like &(heapTuple->t_self).

I'm also a bit confused why we don't need to pass in the offset of the

current tuple, rather than the HOT root tuple here. That's not related
to this patch. But aren't we locking the wrong tuple here, in case of
HOT?

Yes, root is being locked here instead of the HOT. But I don't have full
context on the same. If we wish to fix it though, can be easily done now
with the patch by passing "tid" instead of &(heapTuple->t_self).

- I wonder if CheckForSerializableConflictOutNeeded() shouldn't have a

portion of it's code as a static inline. In particular, it's a shame
that we currently perform external function calls at quite the
frequency when serializable isn't even in use.

I am not sure on portion of the code part? SerializationNeededForRead() is
static inline function in C file. Can't inline
CheckForSerializableConflictOutNeeded() without moving
SerializationNeededForRead() and some other variables to header file.
CheckForSerializableConflictOut() wasn't inline either, so a function call
was performed earlier as well when serializable isn't even in use.

I understand that with refactor, HeapCheckForSerializableConflictOut() is
called which calls CheckForSerializableConflictOutNeeded(). If that's the
problem, for addressing the same, I had proposed alternative way to
refactor. CheckForSerializableConflictOut() can take callback function and
void* callback argument for AM specific check instead. So, the flow would
be AM calling CheckForSerializableConflictOut() as today and only if
serializable in use will invoke the callback to check with AM if more work
should be performed or not. Essentially
HeapCheckForSerializableConflictOut() will become callback function
instead. Due to void* callback argument aspect I didn't like that solution
and felt AM performing checks and calling CheckForSerializableConflictOut()
seems more straight forward.

Attachments:

v1-0001-Optimize-TransactionIdIsCurrentTransactionId.patchtext/x-patch; charset=US-ASCII; name=v1-0001-Optimize-TransactionIdIsCurrentTransactionId.patchDownload
From 2516c592db97f285925667aafb34fc2de4286282 Mon Sep 17 00:00:00 2001
From: Ashwin Agrawal <aagrawal@pivotal.io>
Date: Fri, 2 Aug 2019 15:02:28 -0700
Subject: [PATCH v1 1/3] Optimize TransactionIdIsCurrentTransactionId()

If xid is current top transaction, can fast check for the same and
exit early. This should optmize for current heap but also works very
well for future AMs, which don't have separate xid for
subtransactions. This optimization was proposed by Andres.
---
 src/backend/access/transam/xact.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 1bbaeeebf4d..41952bc4d26 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -871,6 +871,9 @@ TransactionIdIsCurrentTransactionId(TransactionId xid)
 	if (!TransactionIdIsNormal(xid))
 		return false;
 
+	if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
+		return true;
+
 	/*
 	 * In parallel workers, the XIDs we must consider as current are stored in
 	 * ParallelCurrentXids rather than the transaction-state stack.  Note that
-- 
2.19.1

v1-0002-Optimize-PredicateLockTuple.patchtext/x-patch; charset=US-ASCII; name=v1-0002-Optimize-PredicateLockTuple.patchDownload
From 45ba97a206a8c8f04188fe10b9b73c0b79f5d263 Mon Sep 17 00:00:00 2001
From: Ashwin Agrawal <aagrawal@pivotal.io>
Date: Fri, 2 Aug 2019 15:21:16 -0700
Subject: [PATCH v1 2/3] Optimize PredicateLockTuple().

PredicateLockTuple fast exits if tuple was written by current
transaction, as for that case it already has the lock. This check can
be easily performed using TransactionIdIsCurrentTransactionId()
instead of using SubTransGetTopmostTransaction(). Since
TransactionIdIsCurrentTransactionId() is all in-memory operation,
makes this efficient compared to SubTransGetTopmostTransaction(),
which can hit the disk.

This simplification was proposed by Andres.
---
 src/backend/storage/lmgr/predicate.c | 22 ++++------------------
 1 file changed, 4 insertions(+), 18 deletions(-)

diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 2d709420c3d..1417ba37441 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -2559,24 +2559,10 @@ PredicateLockTuple(Relation relation, HeapTuple tuple, Snapshot snapshot)
 	 */
 	if (relation->rd_index == NULL)
 	{
-		TransactionId myxid;
-
-		targetxmin = HeapTupleHeaderGetXmin(tuple->t_data);
-
-		myxid = GetTopTransactionIdIfAny();
-		if (TransactionIdIsValid(myxid))
-		{
-			if (TransactionIdFollowsOrEquals(targetxmin, TransactionXmin))
-			{
-				TransactionId xid = SubTransGetTopmostTransaction(targetxmin);
-
-				if (TransactionIdEquals(xid, myxid))
-				{
-					/* We wrote it; we already have a write lock. */
-					return;
-				}
-			}
-		}
+		/* If we wrote it; we already have a write lock. */
+		if (TransactionIdIsCurrentTransactionId(
+				HeapTupleHeaderGetXmin(tuple->t_data)))
+			return;
 	}
 
 	/*
-- 
2.19.1

v1-0003-Remove-HeapTuple-dependency-for-predicate-locking.patchtext/x-patch; charset=US-ASCII; name=v1-0003-Remove-HeapTuple-dependency-for-predicate-locking.patchDownload
From f4cf0dbc622f434fb3f1d46d8bc02fcfd0cc1b21 Mon Sep 17 00:00:00 2001
From: Ashwin Agrawal <aagrawal@pivotal.io>
Date: Fri, 2 Aug 2019 11:44:21 -0700
Subject: [PATCH v1 3/3] Remove HeapTuple dependency for predicate locking
 functions.

Following changes help to make predicate locking functions generic and
not tied to particular AM.

- PredicateLockTuple() is renamed to PredicateLockTID(). It takes
  ItemPointer and inserting transaction id now instead of HeapTuple.

- CheckForSerializableConflictIn() takes blocknum instead of buffer

- CheckForSerializableConflictOut() no more takes HeapTuple nor buffer
---
 src/backend/access/gin/ginbtree.c        |   2 +-
 src/backend/access/gin/ginfast.c         |   2 +-
 src/backend/access/gin/gininsert.c       |   4 +-
 src/backend/access/gist/gist.c           |   2 +-
 src/backend/access/hash/hashinsert.c     |   2 +-
 src/backend/access/heap/heapam.c         | 119 ++++++++++++++++++---
 src/backend/access/heap/heapam_handler.c |   7 +-
 src/backend/access/index/indexam.c       |   4 +-
 src/backend/access/nbtree/nbtinsert.c    |   4 +-
 src/backend/storage/lmgr/predicate.c     | 127 +++++++----------------
 src/include/access/heapam.h              |   2 +
 src/include/storage/predicate.h          |   9 +-
 12 files changed, 164 insertions(+), 120 deletions(-)

diff --git a/src/backend/access/gin/ginbtree.c b/src/backend/access/gin/ginbtree.c
index 11a8ed7bbc2..e795375495b 100644
--- a/src/backend/access/gin/ginbtree.c
+++ b/src/backend/access/gin/ginbtree.c
@@ -89,7 +89,7 @@ ginFindLeafPage(GinBtree btree, bool searchMode,
 	stack->predictNumber = 1;
 
 	if (rootConflictCheck)
-		CheckForSerializableConflictIn(btree->index, NULL, stack->buffer);
+		CheckForSerializableConflictIn(btree->index, NULL, btree->rootBlkno);
 
 	for (;;)
 	{
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index 439a91b3e61..d7b52476817 100644
--- a/src/backend/access/gin/ginfast.c
+++ b/src/backend/access/gin/ginfast.c
@@ -246,7 +246,7 @@ ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector)
 	 * tree, so it conflicts with all serializable scans.  All scans acquire a
 	 * predicate lock on the metabuffer to represent that.
 	 */
-	CheckForSerializableConflictIn(index, NULL, metabuffer);
+	CheckForSerializableConflictIn(index, NULL, GIN_METAPAGE_BLKNO);
 
 	if (collector->sumsize + collector->ntuples * sizeof(ItemIdData) > GinListPageSize)
 	{
diff --git a/src/backend/access/gin/gininsert.c b/src/backend/access/gin/gininsert.c
index 55eab146173..046a20a3d41 100644
--- a/src/backend/access/gin/gininsert.c
+++ b/src/backend/access/gin/gininsert.c
@@ -221,7 +221,7 @@ ginEntryInsert(GinState *ginstate,
 			return;
 		}
 
-		CheckForSerializableConflictIn(ginstate->index, NULL, stack->buffer);
+		CheckForSerializableConflictIn(ginstate->index, NULL, BufferGetBlockNumber(stack->buffer));
 		/* modify an existing leaf entry */
 		itup = addItemPointersToLeafTuple(ginstate, itup,
 										  items, nitem, buildStats, stack->buffer);
@@ -230,7 +230,7 @@ ginEntryInsert(GinState *ginstate,
 	}
 	else
 	{
-		CheckForSerializableConflictIn(ginstate->index, NULL, stack->buffer);
+		CheckForSerializableConflictIn(ginstate->index, NULL, BufferGetBlockNumber(stack->buffer));
 		/* no match, so construct a new leaf entry */
 		itup = buildFreshLeafTuple(ginstate, attnum, key, category,
 								   items, nitem, buildStats, stack->buffer);
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index e9ca4b82527..4db54a42761 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -1263,7 +1263,7 @@ gistinserttuples(GISTInsertState *state, GISTInsertStack *stack,
 	 * Check for any rw conflicts (in serializable isolation level) just
 	 * before we intend to modify the page
 	 */
-	CheckForSerializableConflictIn(state->r, NULL, stack->buffer);
+	CheckForSerializableConflictIn(state->r, NULL, BufferGetBlockNumber(stack->buffer));
 
 	/* Insert the tuple(s) to the page, splitting the page if necessary */
 	is_split = gistplacetopage(state->r, state->freespace, giststate,
diff --git a/src/backend/access/hash/hashinsert.c b/src/backend/access/hash/hashinsert.c
index 5321762d5ea..e3fb47f9e3d 100644
--- a/src/backend/access/hash/hashinsert.c
+++ b/src/backend/access/hash/hashinsert.c
@@ -88,7 +88,7 @@ restart_insert:
 										  &usedmetap);
 	Assert(usedmetap != NULL);
 
-	CheckForSerializableConflictIn(rel, NULL, buf);
+	CheckForSerializableConflictIn(rel, NULL, BufferGetBlockNumber(buf));
 
 	/* remember the primary bucket buffer to release the pin on it at end. */
 	bucket_buf = buf;
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 94309949fac..936ad62764b 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -40,6 +40,7 @@
 #include "access/multixact.h"
 #include "access/parallel.h"
 #include "access/relscan.h"
+#include "access/subtrans.h"
 #include "access/sysattr.h"
 #include "access/tableam.h"
 #include "access/transam.h"
@@ -446,7 +447,7 @@ heapgetpage(TableScanDesc sscan, BlockNumber page)
 			else
 				valid = HeapTupleSatisfiesVisibility(&loctup, snapshot, buffer);
 
-			CheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
+			HeapCheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
 											&loctup, buffer, snapshot);
 
 			if (valid)
@@ -668,7 +669,7 @@ heapgettup(HeapScanDesc scan,
 													 snapshot,
 													 scan->rs_cbuf);
 
-				CheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
+				HeapCheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
 												tuple, scan->rs_cbuf,
 												snapshot);
 
@@ -1477,9 +1478,10 @@ heap_fetch(Relation relation,
 	valid = HeapTupleSatisfiesVisibility(tuple, snapshot, buffer);
 
 	if (valid)
-		PredicateLockTuple(relation, tuple, snapshot);
+		PredicateLockTID(relation, &(tuple->t_self), snapshot,
+						 HeapTupleHeaderGetXmin(tuple->t_data));
 
-	CheckForSerializableConflictOut(valid, relation, tuple, buffer, snapshot);
+	HeapCheckForSerializableConflictOut(valid, relation, tuple, buffer, snapshot);
 
 	LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
 
@@ -1613,15 +1615,16 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
 
 			/* If it's visible per the snapshot, we must return it */
 			valid = HeapTupleSatisfiesVisibility(heapTuple, snapshot, buffer);
-			CheckForSerializableConflictOut(valid, relation, heapTuple,
-											buffer, snapshot);
+			HeapCheckForSerializableConflictOut(valid, relation, heapTuple,
+												buffer, snapshot);
 			/* reset to original, non-redirected, tid */
 			heapTuple->t_self = *tid;
 
 			if (valid)
 			{
 				ItemPointerSetOffsetNumber(tid, offnum);
-				PredicateLockTuple(relation, heapTuple, snapshot);
+				PredicateLockTID(relation, &heapTuple->t_self, snapshot,
+								 HeapTupleHeaderGetXmin(heapTuple->t_data));
 				if (all_dead)
 					*all_dead = false;
 				return true;
@@ -1755,7 +1758,7 @@ heap_get_latest_tid(TableScanDesc sscan,
 		 * candidate.
 		 */
 		valid = HeapTupleSatisfiesVisibility(&tp, snapshot, buffer);
-		CheckForSerializableConflictOut(valid, relation, &tp, buffer, snapshot);
+		HeapCheckForSerializableConflictOut(valid, relation, &tp, buffer, snapshot);
 		if (valid)
 			*tid = ctid;
 
@@ -1910,7 +1913,7 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
 	 * lock "gaps" as index page locks do.  So we don't need to specify a
 	 * buffer when making the call, which makes for a faster check.
 	 */
-	CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);
+	CheckForSerializableConflictIn(relation, NULL, InvalidBlockNumber);
 
 	/* NO EREPORT(ERROR) from here till changes are logged */
 	START_CRIT_SECTION();
@@ -2164,7 +2167,7 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
 	 * lock "gaps" as index page locks do.  So we don't need to specify a
 	 * buffer when making the call, which makes for a faster check.
 	 */
-	CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);
+	CheckForSerializableConflictIn(relation, NULL, InvalidBlockNumber);
 
 	ndone = 0;
 	while (ndone < ntuples)
@@ -2355,7 +2358,7 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
 	 * lock "gaps" as index page locks do.  So we don't need to specify a
 	 * buffer when making the call.
 	 */
-	CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);
+	CheckForSerializableConflictIn(relation, NULL, InvalidBlockNumber);
 
 	/*
 	 * If tuples are cachable, mark them for invalidation from the caches in
@@ -2669,7 +2672,7 @@ l1:
 	 * being visible to the scan (i.e., an exclusive buffer content lock is
 	 * continuously held from this point until the tuple delete is visible).
 	 */
-	CheckForSerializableConflictIn(relation, &tp, buffer);
+	CheckForSerializableConflictIn(relation, tid, BufferGetBlockNumber(buffer));
 
 	/* replace cid with a combo cid if necessary */
 	HeapTupleHeaderAdjustCmax(tp.t_data, &cid, &iscombo);
@@ -3584,7 +3587,7 @@ l2:
 	 * will include checking the relation level, there is no benefit to a
 	 * separate check for the new tuple.
 	 */
-	CheckForSerializableConflictIn(relation, &oldtup, buffer);
+	CheckForSerializableConflictIn(relation, otid, BufferGetBlockNumber(buffer));
 
 	/*
 	 * At this point newbuf and buffer are both pinned and locked, and newbuf
@@ -9044,3 +9047,93 @@ heap_mask(char *pagedata, BlockNumber blkno)
 		}
 	}
 }
+
+/*
+ * HeapCheckForSerializableConflictOut
+ *		We are reading a tuple which has been modified.  If it is visible to
+ *		us but has been deleted, that indicates a rw-conflict out.  If it's
+ *		not visible and was created by a concurrent (overlapping)
+ *		serializable transaction, that is also a rw-conflict out,
+ *
+ * We will determine the top level xid of the writing transaction with which
+ * we may be in conflict, and check for overlap with our own transaction.
+ * If the transactions overlap (i.e., they cannot see each other's writes),
+ * then we have a conflict out.
+ *
+ * This function should be called just about anywhere in heapam.c where a
+ * tuple has been read. The caller must hold at least a shared lock on the
+ * buffer, because this function might set hint bits on the tuple. There is
+ * currently no known reason to call this function from an index AM.
+ */
+void
+HeapCheckForSerializableConflictOut(bool visible, Relation relation,
+									HeapTuple tuple, Buffer buffer,
+									Snapshot snapshot)
+{
+	TransactionId xid;
+	HTSV_Result htsvResult;
+
+	if (!CheckForSerializableConflictOutNeeded(relation, snapshot))
+		return;
+
+	/*
+	 * Check to see whether the tuple has been written to by a concurrent
+	 * transaction, either to create it not visible to us, or to delete it
+	 * while it is visible to us.  The "visible" bool indicates whether the
+	 * tuple is visible to us, while HeapTupleSatisfiesVacuum checks what else
+	 * is going on with it.
+	 */
+	htsvResult = HeapTupleSatisfiesVacuum(tuple, TransactionXmin, buffer);
+	switch (htsvResult)
+	{
+		case HEAPTUPLE_LIVE:
+			if (visible)
+				return;
+			xid = HeapTupleHeaderGetXmin(tuple->t_data);
+			break;
+		case HEAPTUPLE_RECENTLY_DEAD:
+			if (!visible)
+				return;
+			xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
+			break;
+		case HEAPTUPLE_DELETE_IN_PROGRESS:
+			xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
+			break;
+		case HEAPTUPLE_INSERT_IN_PROGRESS:
+			xid = HeapTupleHeaderGetXmin(tuple->t_data);
+			break;
+		case HEAPTUPLE_DEAD:
+			return;
+		default:
+
+			/*
+			 * The only way to get to this default clause is if a new value is
+			 * added to the enum type without adding it to this switch
+			 * statement.  That's a bug, so elog.
+			 */
+			elog(ERROR, "unrecognized return value from HeapTupleSatisfiesVacuum: %u", htsvResult);
+
+			/*
+			 * In spite of having all enum values covered and calling elog on
+			 * this default, some compilers think this is a code path which
+			 * allows xid to be used below without initialization. Silence
+			 * that warning.
+			 */
+			xid = InvalidTransactionId;
+	}
+
+	Assert(TransactionIdIsValid(xid));
+	Assert(TransactionIdFollowsOrEquals(xid, TransactionXmin));
+
+	/*
+	 * Find top level xid.  Bail out if xid is too early to be a conflict, or
+	 * if it's our own xid.
+	 */
+	if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
+		return;
+	xid = SubTransGetTopmostTransaction(xid);
+	if (TransactionIdPrecedes(xid, TransactionXmin))
+		return;
+
+	return CheckForSerializableConflictOut(relation, xid, snapshot);
+}
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index f1ff01e8cb9..70ed8c8af32 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -2166,9 +2166,10 @@ heapam_scan_bitmap_next_block(TableScanDesc scan,
 			if (valid)
 			{
 				hscan->rs_vistuples[ntup++] = offnum;
-				PredicateLockTuple(scan->rs_rd, &loctup, snapshot);
+				PredicateLockTID(scan->rs_rd, &loctup.t_self, snapshot,
+								 HeapTupleHeaderGetXmin(loctup.t_data));
 			}
-			CheckForSerializableConflictOut(valid, scan->rs_rd, &loctup,
+			HeapCheckForSerializableConflictOut(valid, scan->rs_rd, &loctup,
 											buffer, snapshot);
 		}
 	}
@@ -2356,7 +2357,7 @@ heapam_scan_sample_next_tuple(TableScanDesc scan, SampleScanState *scanstate,
 
 			/* in pagemode, heapgetpage did this for us */
 			if (!pagemode)
-				CheckForSerializableConflictOut(visible, scan->rs_rd, tuple,
+				HeapCheckForSerializableConflictOut(visible, scan->rs_rd, tuple,
 												hscan->rs_cbuf, scan->rs_snapshot);
 
 			/* Try next tuple from same page. */
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index 28edd4aca76..6f1093ae383 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -180,8 +180,8 @@ index_insert(Relation indexRelation,
 
 	if (!(indexRelation->rd_indam->ampredlocks))
 		CheckForSerializableConflictIn(indexRelation,
-									   (HeapTuple) NULL,
-									   InvalidBuffer);
+									   (ItemPointer) NULL,
+									   InvalidBlockNumber);
 
 	return indexRelation->rd_indam->aminsert(indexRelation, values, isnull,
 											 heap_t_ctid, heapRelation,
diff --git a/src/backend/access/nbtree/nbtinsert.c b/src/backend/access/nbtree/nbtinsert.c
index 5890f393f67..6f73fb73638 100644
--- a/src/backend/access/nbtree/nbtinsert.c
+++ b/src/backend/access/nbtree/nbtinsert.c
@@ -290,7 +290,7 @@ top:
 		 * checkingunique and !heapkeyspace cases, but it's okay to use the
 		 * first page the value could be on (with scantid omitted) instead.
 		 */
-		CheckForSerializableConflictIn(rel, NULL, insertstate.buf);
+		CheckForSerializableConflictIn(rel, NULL, BufferGetBlockNumber(insertstate.buf));
 
 		/*
 		 * Do the insertion.  Note that insertstate contains cached binary
@@ -533,7 +533,7 @@ _bt_check_unique(Relation rel, BTInsertState insertstate, Relation heapRel,
 					 * otherwise be masked by this unique constraint
 					 * violation.
 					 */
-					CheckForSerializableConflictIn(rel, NULL, insertstate->buf);
+					CheckForSerializableConflictIn(rel, NULL, BufferGetBlockNumber(insertstate->buf));
 
 					/*
 					 * This is a definite conflict.  Break the tuple down into
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 1417ba37441..3bd430baca9 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -163,8 +163,8 @@
  *		PredicateLockRelation(Relation relation, Snapshot snapshot)
  *		PredicateLockPage(Relation relation, BlockNumber blkno,
  *						Snapshot snapshot)
- *		PredicateLockTuple(Relation relation, HeapTuple tuple,
- *						Snapshot snapshot)
+ *		PredicateLockTID(Relation relation, ItemPointer tid, Snapshot snapshot,
+ *						 TransactionId insert_xid)
  *		PredicateLockPageSplit(Relation relation, BlockNumber oldblkno,
  *							   BlockNumber newblkno)
  *		PredicateLockPageCombine(Relation relation, BlockNumber oldblkno,
@@ -173,11 +173,10 @@
  *		ReleasePredicateLocks(bool isCommit, bool isReadOnlySafe)
  *
  * conflict detection (may also trigger rollback)
- *		CheckForSerializableConflictOut(bool visible, Relation relation,
- *										HeapTupleData *tup, Buffer buffer,
+ *		CheckForSerializableConflictOut(Relation relation, TransactionId xid,
  *										Snapshot snapshot)
- *		CheckForSerializableConflictIn(Relation relation, HeapTupleData *tup,
- *									   Buffer buffer)
+ *		CheckForSerializableConflictIn(Relation relation, ItemPointer tid,
+ *									   BlockNumber blkno)
  *		CheckTableForSerializableConflictIn(Relation relation)
  *
  * final rollback checking
@@ -193,8 +192,6 @@
 
 #include "postgres.h"
 
-#include "access/heapam.h"
-#include "access/htup_details.h"
 #include "access/parallel.h"
 #include "access/slru.h"
 #include "access/subtrans.h"
@@ -2538,30 +2535,28 @@ PredicateLockPage(Relation relation, BlockNumber blkno, Snapshot snapshot)
 }
 
 /*
- *		PredicateLockTuple
+ *		PredicateLockTID
  *
  * Gets a predicate lock at the tuple level.
  * Skip if not in full serializable transaction isolation level.
  * Skip if this is a temporary table.
  */
 void
-PredicateLockTuple(Relation relation, HeapTuple tuple, Snapshot snapshot)
+PredicateLockTID(Relation relation, ItemPointer tid, Snapshot snapshot,
+				 TransactionId tuple_xid)
 {
 	PREDICATELOCKTARGETTAG tag;
-	ItemPointer tid;
-	TransactionId targetxmin;
 
 	if (!SerializationNeededForRead(relation, snapshot))
 		return;
 
 	/*
-	 * If it's a heap tuple, return if this xact wrote it.
+	 * Return if this xact wrote it.
 	 */
 	if (relation->rd_index == NULL)
 	{
 		/* If we wrote it; we already have a write lock. */
-		if (TransactionIdIsCurrentTransactionId(
-				HeapTupleHeaderGetXmin(tuple->t_data)))
+		if (TransactionIdIsCurrentTransactionId(tuple_xid))
 			return;
 	}
 
@@ -2577,7 +2572,6 @@ PredicateLockTuple(Relation relation, HeapTuple tuple, Snapshot snapshot)
 	if (PredicateLockExists(&tag))
 		return;
 
-	tid = &(tuple->t_self);
 	SET_PREDICATELOCKTARGETTAG_TUPLE(tag,
 									 relation->rd_node.dbNode,
 									 relation->rd_id,
@@ -4022,33 +4016,43 @@ XidIsConcurrent(TransactionId xid)
 	return false;
 }
 
+bool
+CheckForSerializableConflictOutNeeded(Relation relation, Snapshot snapshot)
+{
+	if (!SerializationNeededForRead(relation, snapshot))
+		return false;
+
+	/* Check if someone else has already decided that we need to die */
+	if (SxactIsDoomed(MySerializableXact))
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+				 errmsg("could not serialize access due to read/write dependencies among transactions"),
+				 errdetail_internal("Reason code: Canceled on identification as a pivot, during conflict out checking."),
+				 errhint("The transaction might succeed if retried.")));
+	}
+
+	return true;
+}
+
 /*
  * CheckForSerializableConflictOut
  *		We are reading a tuple which has been modified.  If it is visible to
  *		us but has been deleted, that indicates a rw-conflict out.  If it's
  *		not visible and was created by a concurrent (overlapping)
- *		serializable transaction, that is also a rw-conflict out,
+ *		serializable transaction, that is also a rw-conflict out.
  *
  * We will determine the top level xid of the writing transaction with which
  * we may be in conflict, and check for overlap with our own transaction.
  * If the transactions overlap (i.e., they cannot see each other's writes),
  * then we have a conflict out.
- *
- * This function should be called just about anywhere in heapam.c where a
- * tuple has been read. The caller must hold at least a shared lock on the
- * buffer, because this function might set hint bits on the tuple. There is
- * currently no known reason to call this function from an index AM.
  */
 void
-CheckForSerializableConflictOut(bool visible, Relation relation,
-								HeapTuple tuple, Buffer buffer,
-								Snapshot snapshot)
+CheckForSerializableConflictOut(Relation relation, TransactionId xid, Snapshot snapshot)
 {
-	TransactionId xid;
 	SERIALIZABLEXIDTAG sxidtag;
 	SERIALIZABLEXID *sxid;
 	SERIALIZABLEXACT *sxact;
-	HTSV_Result htsvResult;
 
 	if (!SerializationNeededForRead(relation, snapshot))
 		return;
@@ -4062,64 +4066,8 @@ CheckForSerializableConflictOut(bool visible, Relation relation,
 				 errdetail_internal("Reason code: Canceled on identification as a pivot, during conflict out checking."),
 				 errhint("The transaction might succeed if retried.")));
 	}
-
-	/*
-	 * Check to see whether the tuple has been written to by a concurrent
-	 * transaction, either to create it not visible to us, or to delete it
-	 * while it is visible to us.  The "visible" bool indicates whether the
-	 * tuple is visible to us, while HeapTupleSatisfiesVacuum checks what else
-	 * is going on with it.
-	 */
-	htsvResult = HeapTupleSatisfiesVacuum(tuple, TransactionXmin, buffer);
-	switch (htsvResult)
-	{
-		case HEAPTUPLE_LIVE:
-			if (visible)
-				return;
-			xid = HeapTupleHeaderGetXmin(tuple->t_data);
-			break;
-		case HEAPTUPLE_RECENTLY_DEAD:
-			if (!visible)
-				return;
-			xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
-			break;
-		case HEAPTUPLE_DELETE_IN_PROGRESS:
-			xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
-			break;
-		case HEAPTUPLE_INSERT_IN_PROGRESS:
-			xid = HeapTupleHeaderGetXmin(tuple->t_data);
-			break;
-		case HEAPTUPLE_DEAD:
-			return;
-		default:
-
-			/*
-			 * The only way to get to this default clause is if a new value is
-			 * added to the enum type without adding it to this switch
-			 * statement.  That's a bug, so elog.
-			 */
-			elog(ERROR, "unrecognized return value from HeapTupleSatisfiesVacuum: %u", htsvResult);
-
-			/*
-			 * In spite of having all enum values covered and calling elog on
-			 * this default, some compilers think this is a code path which
-			 * allows xid to be used below without initialization. Silence
-			 * that warning.
-			 */
-			xid = InvalidTransactionId;
-	}
 	Assert(TransactionIdIsValid(xid));
-	Assert(TransactionIdFollowsOrEquals(xid, TransactionXmin));
 
-	/*
-	 * Find top level xid.  Bail out if xid is too early to be a conflict, or
-	 * if it's our own xid.
-	 */
-	if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
-		return;
-	xid = SubTransGetTopmostTransaction(xid);
-	if (TransactionIdPrecedes(xid, TransactionXmin))
-		return;
 	if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
 		return;
 
@@ -4425,8 +4373,7 @@ CheckTargetForConflictsIn(PREDICATELOCKTARGETTAG *targettag)
  * tuple itself.
  */
 void
-CheckForSerializableConflictIn(Relation relation, HeapTuple tuple,
-							   Buffer buffer)
+CheckForSerializableConflictIn(Relation relation, ItemPointer tid, BlockNumber blkno)
 {
 	PREDICATELOCKTARGETTAG targettag;
 
@@ -4456,22 +4403,22 @@ CheckForSerializableConflictIn(Relation relation, HeapTuple tuple,
 	 * It is not possible to take and hold a lock across the checks for all
 	 * granularities because each target could be in a separate partition.
 	 */
-	if (tuple != NULL)
+	if (tid != NULL)
 	{
 		SET_PREDICATELOCKTARGETTAG_TUPLE(targettag,
 										 relation->rd_node.dbNode,
 										 relation->rd_id,
-										 ItemPointerGetBlockNumber(&(tuple->t_self)),
-										 ItemPointerGetOffsetNumber(&(tuple->t_self)));
+										 ItemPointerGetBlockNumber(tid),
+										 ItemPointerGetOffsetNumber(tid));
 		CheckTargetForConflictsIn(&targettag);
 	}
 
-	if (BufferIsValid(buffer))
+	if (blkno != InvalidBlockNumber)
 	{
 		SET_PREDICATELOCKTARGETTAG_PAGE(targettag,
 										relation->rd_node.dbNode,
 										relation->rd_id,
-										BufferGetBlockNumber(buffer));
+										blkno);
 		CheckTargetForConflictsIn(&targettag);
 	}
 
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6bc96..390023c0a55 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -217,5 +217,7 @@ extern bool ResolveCminCmaxDuringDecoding(struct HTAB *tuplecid_data,
 										  HeapTuple htup,
 										  Buffer buffer,
 										  CommandId *cmin, CommandId *cmax);
+extern void HeapCheckForSerializableConflictOut(bool valid, Relation relation, HeapTuple tuple,
+												Buffer buffer, Snapshot snapshot);
 
 #endif							/* HEAPAM_H */
diff --git a/src/include/storage/predicate.h b/src/include/storage/predicate.h
index 376245ecd70..7e1f67b8edb 100644
--- a/src/include/storage/predicate.h
+++ b/src/include/storage/predicate.h
@@ -57,16 +57,17 @@ extern void SetSerializableTransactionSnapshot(Snapshot snapshot,
 extern void RegisterPredicateLockingXid(TransactionId xid);
 extern void PredicateLockRelation(Relation relation, Snapshot snapshot);
 extern void PredicateLockPage(Relation relation, BlockNumber blkno, Snapshot snapshot);
-extern void PredicateLockTuple(Relation relation, HeapTuple tuple, Snapshot snapshot);
+extern void PredicateLockTID(Relation relation, ItemPointer tid, Snapshot snapshot,
+							 TransactionId insert_xid);
 extern void PredicateLockPageSplit(Relation relation, BlockNumber oldblkno, BlockNumber newblkno);
 extern void PredicateLockPageCombine(Relation relation, BlockNumber oldblkno, BlockNumber newblkno);
 extern void TransferPredicateLocksToHeapRelation(Relation relation);
 extern void ReleasePredicateLocks(bool isCommit, bool isReadOnlySafe);
 
 /* conflict detection (may also trigger rollback) */
-extern void CheckForSerializableConflictOut(bool valid, Relation relation, HeapTuple tuple,
-											Buffer buffer, Snapshot snapshot);
-extern void CheckForSerializableConflictIn(Relation relation, HeapTuple tuple, Buffer buffer);
+extern bool CheckForSerializableConflictOutNeeded(Relation relation, Snapshot snapshot);
+extern void CheckForSerializableConflictOut(Relation relation, TransactionId xid, Snapshot snapshot);
+extern void CheckForSerializableConflictIn(Relation relation, ItemPointer tid, BlockNumber blkno);
 extern void CheckTableForSerializableConflictIn(Relation relation);
 
 /* final rollback checking */
-- 
2.19.1

#12Thomas Munro
thomas.munro@gmail.com
In reply to: Ashwin Agrawal (#11)
1 attachment(s)
Re: Remove HeapTuple and Buffer dependency for predicate locking functions

On Sat, Aug 3, 2019 at 11:56 AM Ashwin Agrawal <aagrawal@pivotal.io> wrote:

On Wed, Jul 31, 2019 at 2:06 PM Andres Freund <andres@anarazel.de> wrote:

I'm also a bit confused why we don't need to pass in the offset of the
current tuple, rather than the HOT root tuple here. That's not related
to this patch. But aren't we locking the wrong tuple here, in case of
HOT?

Yes, root is being locked here instead of the HOT. But I don't have full context on the same. If we wish to fix it though, can be easily done now with the patch by passing "tid" instead of &(heapTuple->t_self).

Here are three relevant commits:

1. Commit dafaa3efb75 "Implement genuine serializable isolation
level." (2011) locked the root tuple, because it set t_self to *tid.
Possibly due to confusion about the effect of the preceding line
ItemPointerSetOffsetNumber(tid, offnum).

2. Commit commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013)
fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self,
offnum).

3. Commit b89e151054a "Introduce logical decoding." (2014) also did
ItemPointerSet(&(heapTuple->t_self), BufferGetBlockNumber(buffer),
offnum), for the benefit of historical MVCC snapshots (unnecessarily,
considering the change in the commit #2), but then, intending to
"reset to original, non-redirected, tid", clobbered it, reintroducing
the bug fixed by #2.

My first guess is that commit #3 above was developed before commit #2,
and finished up clobbering it. In fact, both logical decoding and SSI
want offnum, so we should be able to just remove the "reset" bit
(perhaps like in the attached sketch, not really tested, though it
passes). This must be in want of an isolation test, but I haven't yet
tried to get my head around how to write a test that would show the
difference.

--
Thomas Munro
https://enterprisedb.com

Attachments:

fix.txttext/plain; charset=US-ASCII; name=fix.txtDownload
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 94309949fa..107605d276 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -1576,7 +1576,6 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
 		heapTuple->t_data = (HeapTupleHeader) PageGetItem(dp, lp);
 		heapTuple->t_len = ItemIdGetLength(lp);
 		heapTuple->t_tableOid = RelationGetRelid(relation);
-		ItemPointerSetOffsetNumber(&heapTuple->t_self, offnum);
 
 		/*
 		 * Shouldn't see a HEAP_ONLY tuple at chain start.
@@ -1608,15 +1607,14 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
 			 * of the root tuple of the HOT chain. This is important because
 			 * the *Satisfies routine for historical mvcc snapshots needs the
 			 * correct tid to decide about the visibility in some cases.
+			 * SSI also needs offnum.
 			 */
-			ItemPointerSet(&(heapTuple->t_self), BufferGetBlockNumber(buffer), offnum);
+			ItemPointerSetOffsetNumber(&heapTuple->t_self, offnum);
 
 			/* If it's visible per the snapshot, we must return it */
 			valid = HeapTupleSatisfiesVisibility(heapTuple, snapshot, buffer);
 			CheckForSerializableConflictOut(valid, relation, heapTuple,
 											buffer, snapshot);
-			/* reset to original, non-redirected, tid */
-			heapTuple->t_self = *tid;
 
 			if (valid)
 			{
#13Andres Freund
andres@anarazel.de
In reply to: Thomas Munro (#12)
Re: Remove HeapTuple and Buffer dependency for predicate locking functions

Hi,

On 2019-08-05 20:58:05 +1200, Thomas Munro wrote:

On Sat, Aug 3, 2019 at 11:56 AM Ashwin Agrawal <aagrawal@pivotal.io> wrote:

On Wed, Jul 31, 2019 at 2:06 PM Andres Freund <andres@anarazel.de> wrote:

I'm also a bit confused why we don't need to pass in the offset of the
current tuple, rather than the HOT root tuple here. That's not related
to this patch. But aren't we locking the wrong tuple here, in case of
HOT?

Yes, root is being locked here instead of the HOT. But I don't have full context on the same. If we wish to fix it though, can be easily done now with the patch by passing "tid" instead of &(heapTuple->t_self).

Here are three relevant commits:

Thanks for digging!

1. Commit dafaa3efb75 "Implement genuine serializable isolation
level." (2011) locked the root tuple, because it set t_self to *tid.
Possibly due to confusion about the effect of the preceding line
ItemPointerSetOffsetNumber(tid, offnum).

2. Commit commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013)
fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self,
offnum).

Hm. It's not at all sure that it's ok to report the non-root tuple tid
here. I.e. I'm fairly sure there was a reason to not just set it to the
actual tid. I think I might have written that up on the list at some
point. Let me dig in memory and list. Obviously possible that that was
also obsoleted by parallel changes.

3. Commit b89e151054a "Introduce logical decoding." (2014) also did
ItemPointerSet(&(heapTuple->t_self), BufferGetBlockNumber(buffer),
offnum), for the benefit of historical MVCC snapshots (unnecessarily,
considering the change in the commit #2), but then, intending to
"reset to original, non-redirected, tid", clobbered it, reintroducing
the bug fixed by #2.

My first guess is that commit #3 above was developed before commit #2,
and finished up clobbering it.

Yea, that sounds likely.

This must be in want of an isolation test, but I haven't yet tried to
get my head around how to write a test that would show the difference.

Indeed.

Greetings,

Andres Freund

#14Thomas Munro
thomas.munro@gmail.com
In reply to: Andres Freund (#13)
1 attachment(s)
Re: Remove HeapTuple and Buffer dependency for predicate locking functions

On Tue, Aug 6, 2019 at 4:35 AM Andres Freund <andres@anarazel.de> wrote:

On 2019-08-05 20:58:05 +1200, Thomas Munro wrote:

1. Commit dafaa3efb75 "Implement genuine serializable isolation
level." (2011) locked the root tuple, because it set t_self to *tid.
Possibly due to confusion about the effect of the preceding line
ItemPointerSetOffsetNumber(tid, offnum).

2. Commit commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013)
fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self,
offnum).

Hm. It's not at all sure that it's ok to report the non-root tuple tid
here. I.e. I'm fairly sure there was a reason to not just set it to the
actual tid. I think I might have written that up on the list at some
point. Let me dig in memory and list. Obviously possible that that was
also obsoleted by parallel changes.

Adding Heikki and Kevin.

I haven't found your earlier discussion about that yet, but would be
keen to read it if you can find it. I wondered if your argument
might have had something to do with heap pruning, but I can't find a
problem there. It's not as though the TID of any visible tuples
change, it's just that some dead stuff goes away and the root line
pointer is changed to LP_REDIRECT if it wasn't already.

As for the argument for locking the tuple we emit rather than the HOT
root, I think the questions are: (1) How exactly do we get away with
locking only one version in a regular non-HOT update chain? (2) Is it
OK to do that with a HOT root?

The answer to the first question is given in README-SSI[1]Excerpt from README-SSI:.
Unfortunately it doesn't discuss HOT directly, but I suspect the
answer is no, HOT is not special here. By my reading, it relies on
the version you lock being the version visible to your snapshot, which
is important because later updates have to touch that tuple to write
the next version. That doesn't apply to some arbitrarily older tuple
that happens to be a HOT root. Concretely, heap_update() does
CheckForSerializableConflictIn(relation, &oldtup, buffer), which is
only going to produce a rw conflict if T1 took an SIREAD on precisely
the version T2 locks in that path, not some arbitrarily older version
that happens to be a HOT root. A HOT root might never be considered
again by concurrent writers, no?

As a minor consequence, the optimisation in
CheckTargetForConflictsIn() assumes that a tuple being updated has the
same tag as we locked when reading the tuple, which isn't the case if
we locked the root while reading but now have the TID for the version
we actually read, so in master we leak a tuple lock unnecessarily
until end-of-transaction when we update a HOT tuple.

This must be in want of an isolation test, but I haven't yet tried to
get my head around how to write a test that would show the difference.

Indeed.

One practical problem is that the only way to reach
PredicateLockTuple() is from an index scan, and the index scan locks
the index page (or the whole index, depending on
rd_indam->ampredlocks). So I think if you want to see a serialization
anomaly you'll need multiple indexes (so that index page locks don't
hide the problem), a long enough HOT chain and then probably several
transactions to be able to miss a cycle that should be picked up by
the logic in [1]Excerpt from README-SSI:. I'm out of steam for this problem today though.

The simple test from the report[3]/messages/by-id/523C29A8.20904@vmware.com that resulted in commit 81fbbfe3352
doesn't work for me (ie with permutation "r1" "r2" "w1" "w2" "c1" "c2"
twice in a row). The best I've come up with so far is an assertion
that we predicate-lock the same row version that we emitted to the
user, when reached via an index lookup that visits a HOT row. The
test outputs 'f' for master, but 't' with the change to heapam.c.

[1]: Excerpt from README-SSI:

===
* PostgreSQL does not use "update in place" with a rollback log
for its MVCC implementation. Where possible it uses "HOT" updates on
the same page (if there is room and no indexed value is changed).
For non-HOT updates the old tuple is expired in place and a new tuple
is inserted at a new location. Because of this difference, a tuple
lock in PostgreSQL doesn't automatically lock any other versions of a
row. We don't try to copy or expand a tuple lock to any other
versions of the row, based on the following proof that any additional
serialization failures we would get from that would be false
positives:

o If transaction T1 reads a row version (thus acquiring a
predicate lock on it) and a second transaction T2 updates that row
version (thus creating a rw-conflict graph edge from T1 to T2), must a
third transaction T3 which re-updates the new version of the row also
have a rw-conflict in from T1 to prevent anomalies? In other words,
does it matter whether we recognize the edge T1 -> T3?

o If T1 has a conflict in, it certainly doesn't. Adding the
edge T1 -> T3 would create a dangerous structure, but we already had
one from the edge T1 -> T2, so we would have aborted something anyway.
(T2 has already committed, else T3 could not have updated its output;
but we would have aborted either T1 or T1's predecessor(s). Hence
no cycle involving T1 and T3 can survive.)

o Now let's consider the case where T1 doesn't have a
rw-conflict in. If that's the case, for this edge T1 -> T3 to make a
difference, T3 must have a rw-conflict out that induces a cycle in the
dependency graph, i.e. a conflict out to some transaction preceding T1
in the graph. (A conflict out to T1 itself would be problematic too,
but that would mean T1 has a conflict in, the case we already
eliminated.)

o So now we're trying to figure out if there can be an
rw-conflict edge T3 -> T0, where T0 is some transaction that precedes
T1. For T0 to precede T1, there has to be some edge, or sequence of
edges, from T0 to T1. At least the last edge has to be a wr-dependency
or ww-dependency rather than a rw-conflict, because T1 doesn't have a
rw-conflict in. And that gives us enough information about the order
of transactions to see that T3 can't have a rw-conflict to T0:
- T0 committed before T1 started (the wr/ww-dependency implies this)
- T1 started before T2 committed (the T1->T2 rw-conflict implies this)
- T2 committed before T3 started (otherwise, T3 would get aborted
because of an update conflict)

o That means T0 committed before T3 started, and therefore
there can't be a rw-conflict from T3 to T0.

o So in all cases, we don't need the T1 -> T3 edge to
recognize cycles. Therefore it's not necessary for T1's SIREAD lock
on the original tuple version to cover later versions as well.
===

[2]: /messages/by-id/52527E4D.4060302@vmware.com
[3]: /messages/by-id/523C29A8.20904@vmware.com

--
Thomas Munro
https://enterprisedb.com

Attachments:

0001-Predicate-lock-the-visible-heap-tuple-not-the-HOT-ro.patchapplication/octet-stream; name=0001-Predicate-lock-the-visible-heap-tuple-not-the-HOT-ro.patchDownload
From 95942f0459d3e1db08ae14c5ee1f65e12be2e036 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.munro@gmail.com>
Date: Tue, 6 Aug 2019 13:49:58 +1200
Subject: [PATCH] Predicate-lock the visible heap tuple, not the HOT root.

Commit 81fbbfe3352 fixed a bug where we predicate-locked the HOT root.
Commit b89e151054a unfixed it.  Refix it.  Add an isolation test that
demonstrates that we locked the row version that we actually emitted.
Back-patch to all supported version.

Author: Thomas Munro
Reported-by: Andres Freund
Discussion: https://postgr.es/m/CALfoeiv0k3hkEb3Oqk%3DziWqtyk2Jys1UOK5hwRBNeANT_yX%2Bng%40mail.gmail.com
---
 src/backend/access/heap/heapam.c              |  6 ++---
 .../isolation/expected/serializable-hot.out   | 16 +++++++++++
 src/test/isolation/isolation_schedule         |  1 +
 .../isolation/specs/serializable-hot.spec     | 27 +++++++++++++++++++
 4 files changed, 46 insertions(+), 4 deletions(-)
 create mode 100644 src/test/isolation/expected/serializable-hot.out
 create mode 100644 src/test/isolation/specs/serializable-hot.spec

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 94309949fa..107605d276 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -1576,7 +1576,6 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
 		heapTuple->t_data = (HeapTupleHeader) PageGetItem(dp, lp);
 		heapTuple->t_len = ItemIdGetLength(lp);
 		heapTuple->t_tableOid = RelationGetRelid(relation);
-		ItemPointerSetOffsetNumber(&heapTuple->t_self, offnum);
 
 		/*
 		 * Shouldn't see a HEAP_ONLY tuple at chain start.
@@ -1608,15 +1607,14 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
 			 * of the root tuple of the HOT chain. This is important because
 			 * the *Satisfies routine for historical mvcc snapshots needs the
 			 * correct tid to decide about the visibility in some cases.
+			 * SSI also needs offnum.
 			 */
-			ItemPointerSet(&(heapTuple->t_self), BufferGetBlockNumber(buffer), offnum);
+			ItemPointerSetOffsetNumber(&heapTuple->t_self, offnum);
 
 			/* If it's visible per the snapshot, we must return it */
 			valid = HeapTupleSatisfiesVisibility(heapTuple, snapshot, buffer);
 			CheckForSerializableConflictOut(valid, relation, heapTuple,
 											buffer, snapshot);
-			/* reset to original, non-redirected, tid */
-			heapTuple->t_self = *tid;
 
 			if (valid)
 			{
diff --git a/src/test/isolation/expected/serializable-hot.out b/src/test/isolation/expected/serializable-hot.out
new file mode 100644
index 0000000000..d2bacd77fa
--- /dev/null
+++ b/src/test/isolation/expected/serializable-hot.out
@@ -0,0 +1,16 @@
+Parsed test spec with 1 sessions
+
+starting permutation: r1 s1 c1
+step r1: SELECT t FROM test WHERE i = 42
+t              
+
+banana sundae  
+step s1: SELECT ('('||page||','||tuple||')')::tid = (SELECT ctid FROM test) AS locked_visible
+              FROM pg_locks
+             WHERE locktype = 'tuple'
+               AND relation = 'test'::regclass
+               AND mode = 'SIReadLock'
+locked_visible 
+
+t              
+step c1: COMMIT;
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 009c2ec54b..8ac88e9c77 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -86,3 +86,4 @@ test: plpgsql-toast
 test: truncate-conflict
 test: serializable-parallel
 test: serializable-parallel-2
+test: serializable-hot
diff --git a/src/test/isolation/specs/serializable-hot.spec b/src/test/isolation/specs/serializable-hot.spec
new file mode 100644
index 0000000000..21d1050cca
--- /dev/null
+++ b/src/test/isolation/specs/serializable-hot.spec
@@ -0,0 +1,27 @@
+# Assert that when we access a tuple that is part of a HOT chain, we acquire a
+# tuple lock on the tuple we actually see, even though the index points at
+# an older version.
+
+setup
+{
+  CREATE TABLE test (i int PRIMARY KEY, t text);
+  INSERT INTO test VALUES (42, 'banana');
+  UPDATE test SET t = t || ' sundae';
+}
+
+teardown
+{
+  DROP TABLE test;
+}
+
+session "s1"
+setup { BEGIN ISOLATION LEVEL SERIALIZABLE; SET enable_seqscan=off; }
+step "r1" { SELECT t FROM test WHERE i = 42 }
+step "s1" { SELECT ('('||page||','||tuple||')')::tid = (SELECT ctid FROM test) AS locked_visible
+              FROM pg_locks
+             WHERE locktype = 'tuple'
+               AND relation = 'test'::regclass
+               AND mode = 'SIReadLock' }
+step "c1" { COMMIT; }
+
+permutation "r1" "s1" "c1"
-- 
2.22.0

#15Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Thomas Munro (#14)
Re: Remove HeapTuple and Buffer dependency for predicate locking functions

Hello Thomas,

On Tue, Aug 6, 2019 at 9:50 AM Thomas Munro <thomas.munro@gmail.com> wrote:

On Tue, Aug 6, 2019 at 4:35 AM Andres Freund <andres@anarazel.de> wrote:

On 2019-08-05 20:58:05 +1200, Thomas Munro wrote:

1. Commit dafaa3efb75 "Implement genuine serializable isolation
level." (2011) locked the root tuple, because it set t_self to *tid.
Possibly due to confusion about the effect of the preceding line
ItemPointerSetOffsetNumber(tid, offnum).

2. Commit commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013)
fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self,
offnum).

Hm. It's not at all sure that it's ok to report the non-root tuple tid
here. I.e. I'm fairly sure there was a reason to not just set it to the
actual tid. I think I might have written that up on the list at some
point. Let me dig in memory and list. Obviously possible that that was
also obsoleted by parallel changes.

Adding Heikki and Kevin.

I haven't found your earlier discussion about that yet, but would be
keen to read it if you can find it. I wondered if your argument
might have had something to do with heap pruning, but I can't find a
problem there. It's not as though the TID of any visible tuples
change, it's just that some dead stuff goes away and the root line
pointer is changed to LP_REDIRECT if it wasn't already.

As for the argument for locking the tuple we emit rather than the HOT
root, I think the questions are: (1) How exactly do we get away with
locking only one version in a regular non-HOT update chain? (2) Is it
OK to do that with a HOT root?

The answer to the first question is given in README-SSI[1].
Unfortunately it doesn't discuss HOT directly, but I suspect the
answer is no, HOT is not special here. By my reading, it relies on
the version you lock being the version visible to your snapshot, which
is important because later updates have to touch that tuple to write
the next version. That doesn't apply to some arbitrarily older tuple
that happens to be a HOT root. Concretely, heap_update() does
CheckForSerializableConflictIn(relation, &oldtup, buffer), which is
only going to produce a rw conflict if T1 took an SIREAD on precisely
the version T2 locks in that path, not some arbitrarily older version
that happens to be a HOT root. A HOT root might never be considered
again by concurrent writers, no?

If I understand the problem, this is the same serialization issue as
with in-place updates for zheap. I had a discussion with Kevin
regarding the same in this thread [1]Re: In-place updates and serializable transactions: /messages/by-id/CAGz5QCJzreUqJqHeXrbEs6xb0zCNKBHhOj6D9Tjd3btJTzydxg@mail.gmail.com. It seems if we're locking the
hot root id, we may report some false positive serializable errors.

This must be in want of an isolation test, but I haven't yet tried to
get my head around how to write a test that would show the difference.

Indeed.

One practical problem is that the only way to reach
PredicateLockTuple() is from an index scan, and the index scan locks
the index page (or the whole index, depending on
rd_indam->ampredlocks). So I think if you want to see a serialization
anomaly you'll need multiple indexes (so that index page locks don't
hide the problem), a long enough HOT chain and then probably several
transactions to be able to miss a cycle that should be picked up by
the logic in [1]. I'm out of steam for this problem today though.

The simple test from the report[3] that resulted in commit 81fbbfe3352
doesn't work for me (ie with permutation "r1" "r2" "w1" "w2" "c1" "c2"
twice in a row). The best I've come up with so far is an assertion
that we predicate-lock the same row version that we emitted to the
user, when reached via an index lookup that visits a HOT row. The
test outputs 'f' for master, but 't' with the change to heapam.c.

Here is an example from the multiple-row-versions isolation test which
fails if we perform in-place updates for zheap. I think the same will
be relevant if we lock root tuple id instead of the tuple itself.
Step 1: T1-> BEGIN; Read FROM t where id=1000000;
Step 2: T2-> BEGIN; UPDATE t where id=1000000; COMMIT; (creates T1->T2)
Step 3: T3-> BEGIN; UPDATE t where id=1000000; Read FROM t where id=500000;
Step 4: T4-> BEGIN; UPDATE t where id= 500000; Read FROM t where id=1;
COMMIT; (creates T3->T4)
Step 5: T3-> COMMIT;
Step 6: T1-> UPDATE t where id=1; COMMIT; (creates T4->T1,)

At step 6, when the update statement is executed, T1 is rolled back
because of T3->T4->T1.

But for zheap, step 3 also creates a dependency T1->T3 because of
in-place update. When T4 commits in step 4, it marks T3 as doomed
because of T1 --> T3 --> T4. Hence, in step 5, T3 is rolled back.

[1]: Re: In-place updates and serializable transactions: /messages/by-id/CAGz5QCJzreUqJqHeXrbEs6xb0zCNKBHhOj6D9Tjd3btJTzydxg@mail.gmail.com
/messages/by-id/CAGz5QCJzreUqJqHeXrbEs6xb0zCNKBHhOj6D9Tjd3btJTzydxg@mail.gmail.com

--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com

#16Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Thomas Munro (#14)
1 attachment(s)
Re: Remove HeapTuple and Buffer dependency for predicate locking functions

On 06/08/2019 07:20, Thomas Munro wrote:

On Tue, Aug 6, 2019 at 4:35 AM Andres Freund <andres@anarazel.de> wrote:

On 2019-08-05 20:58:05 +1200, Thomas Munro wrote:

1. Commit dafaa3efb75 "Implement genuine serializable isolation
level." (2011) locked the root tuple, because it set t_self to *tid.
Possibly due to confusion about the effect of the preceding line
ItemPointerSetOffsetNumber(tid, offnum).

2. Commit commit 81fbbfe3352 "Fix bugs in SSI tuple locking." (2013)
fixed that by adding ItemPointerSetOffsetNumber(&heapTuple->t_self,
offnum).

Hm. It's not at all sure that it's ok to report the non-root tuple tid
here. I.e. I'm fairly sure there was a reason to not just set it to the
actual tid. I think I might have written that up on the list at some
point. Let me dig in memory and list. Obviously possible that that was
also obsoleted by parallel changes.

Adding Heikki and Kevin.

I haven't found your earlier discussion about that yet, but would be
keen to read it if you can find it. I wondered if your argument
might have had something to do with heap pruning, but I can't find a
problem there. It's not as though the TID of any visible tuples
change, it's just that some dead stuff goes away and the root line
pointer is changed to LP_REDIRECT if it wasn't already.

As for the argument for locking the tuple we emit rather than the HOT
root, I think the questions are: (1) How exactly do we get away with
locking only one version in a regular non-HOT update chain? (2) Is it
OK to do that with a HOT root?

The answer to the first question is given in README-SSI[1].
Unfortunately it doesn't discuss HOT directly, but I suspect the
answer is no, HOT is not special here. By my reading, it relies on
the version you lock being the version visible to your snapshot, which
is important because later updates have to touch that tuple to write
the next version. That doesn't apply to some arbitrarily older tuple
that happens to be a HOT root. Concretely, heap_update() does
CheckForSerializableConflictIn(relation, &oldtup, buffer), which is
only going to produce a rw conflict if T1 took an SIREAD on precisely
the version T2 locks in that path, not some arbitrarily older version
that happens to be a HOT root. A HOT root might never be considered
again by concurrent writers, no?

Your analysis is spot on. Thanks for the clear write-up!

This must be in want of an isolation test, but I haven't yet tried to
get my head around how to write a test that would show the difference.

Indeed.

One practical problem is that the only way to reach
PredicateLockTuple() is from an index scan, and the index scan locks
the index page (or the whole index, depending on
rd_indam->ampredlocks). So I think if you want to see a serialization
anomaly you'll need multiple indexes (so that index page locks don't
hide the problem), a long enough HOT chain and then probably several
transactions to be able to miss a cycle that should be picked up by
the logic in [1]. I'm out of steam for this problem today though.

I had some steam, and wrote a spec that reproduces this bug. It wasn't
actually that hard to reproduce, fortunately. Or unfortunately; people
might well be hitting it in production. I used the "freezetest.spec"
from the 2013 thread as the starting point, and added one UPDATE to the
initialization, so that the test starts with an already HOT-updated
tuple. It should throw a serialization error, but on current master, it
does not. After applying your fix.txt, it does.

Your fix.txt seems correct. For clarity, I'd prefer moving things around
a bit, though, so that the t_self is set earlier in the function, at the
same place where the other HeapTuple fields are set. And set blkno and
offnum together, in one ItemPointerSet call. With that, I'm not sure we
need such a verbose comment explaining why t_self needs to be updated
but I kept it for now.

Attached is a patch that contains your fix.txt with the changes for
clarity mentioned above, and an isolationtester test case.

PS. Because heap_hot_search_buffer() now always sets heapTuple->t_self
to the returned tuple version, updating *tid is redundant. And the call
in heapam_index_fetch_tuple() wouldn't need to do
"bslot->base.tupdata.t_self = *tid".

- Heikki

Attachments:

0001-Fix-predicate-locking-of-HOT-updated-rows.patchtext/x-patch; name=0001-Fix-predicate-locking-of-HOT-updated-rows.patchDownload
From 23d07e6d5b259e1fd2fe7694cde51212920fbb3a Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Tue, 6 Aug 2019 12:14:14 +0300
Subject: [PATCH 1/1] Fix predicate-locking of HOT updated rows.

In serializable mode, heap_hot_search_buffer() incorrectly acquired a
predicate lock on the root tuple, not the returned tuple that satisfied
the visibility checks. As explained in README-SSI, the predicate lock does
not need to be copied or extended to other tuple versions, but for that to
work, the correct, visible, tuple version must be locked in the first
place.

The original SSI commit had this bug in it, but it was fixed back in 2013,
in commit 81fbbfe335. But unfortunately, it was reintroduced a few months
later in commit b89e151054. Wising up from that, add a regression test
to cover this, so that it doesn't get reintroduced again. Also, move the
code that sets 't_self', so that it happens at the same time that the
other HeapTuple fields are set, to make it more clear that all the code in
the loop operate on the "current" tuple in the chain, not the root tuple.

Bug spotted by Andres Freund, analysis and original fix by Thomas Munro,
test case and some additional changes to the fix by Heikki Linnakangas.

Discussion: https://www.postgresql.org/message-id/20190731210630.nqhszuktygwftjty%40alap3.anarazel.de
---
 src/backend/access/heap/heapam.c              | 27 ++++++--------
 .../expected/predicate-lock-hot-tuple.out     | 20 ++++++++++
 src/test/isolation/isolation_schedule         |  1 +
 .../specs/predicate-lock-hot-tuple.spec       | 37 +++++++++++++++++++
 4 files changed, 69 insertions(+), 16 deletions(-)
 create mode 100644 src/test/isolation/expected/predicate-lock-hot-tuple.out
 create mode 100644 src/test/isolation/specs/predicate-lock-hot-tuple.spec

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index e33f019939..e81a390ec6 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -1529,6 +1529,7 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
 {
 	Page		dp = (Page) BufferGetPage(buffer);
 	TransactionId prev_xmax = InvalidTransactionId;
+	BlockNumber blkno;
 	OffsetNumber offnum;
 	bool		at_chain_start;
 	bool		valid;
@@ -1538,14 +1539,13 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
 	if (all_dead)
 		*all_dead = first_call;
 
-	Assert(TransactionIdIsValid(RecentGlobalXmin));
-
-	Assert(ItemPointerGetBlockNumber(tid) == BufferGetBlockNumber(buffer));
+	blkno = ItemPointerGetBlockNumber(tid);
 	offnum = ItemPointerGetOffsetNumber(tid);
 	at_chain_start = first_call;
 	skip = !first_call;
 
-	heapTuple->t_self = *tid;
+	Assert(TransactionIdIsValid(RecentGlobalXmin));
+	Assert(BufferGetBlockNumber(buffer) == blkno);
 
 	/* Scan through possible multiple members of HOT-chain */
 	for (;;)
@@ -1573,10 +1573,16 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
 			break;
 		}
 
+		/*
+		 * Update heapTuple to point to the element of the HOT chain we're
+		 * currently investigating. Having t_self set correctly is important
+		 * because the SSI checks and the *Satisfies routine for historical
+		 * MVCC snapshots need the correct tid to decide about the visibility.
+		 */
 		heapTuple->t_data = (HeapTupleHeader) PageGetItem(dp, lp);
 		heapTuple->t_len = ItemIdGetLength(lp);
 		heapTuple->t_tableOid = RelationGetRelid(relation);
-		ItemPointerSetOffsetNumber(&heapTuple->t_self, offnum);
+		ItemPointerSet(&heapTuple->t_self, blkno, offnum);
 
 		/*
 		 * Shouldn't see a HEAP_ONLY tuple at chain start.
@@ -1602,21 +1608,10 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
 		 */
 		if (!skip)
 		{
-			/*
-			 * For the benefit of logical decoding, have t_self point at the
-			 * element of the HOT chain we're currently investigating instead
-			 * of the root tuple of the HOT chain. This is important because
-			 * the *Satisfies routine for historical mvcc snapshots needs the
-			 * correct tid to decide about the visibility in some cases.
-			 */
-			ItemPointerSet(&(heapTuple->t_self), BufferGetBlockNumber(buffer), offnum);
-
 			/* If it's visible per the snapshot, we must return it */
 			valid = HeapTupleSatisfiesVisibility(heapTuple, snapshot, buffer);
 			CheckForSerializableConflictOut(valid, relation, heapTuple,
 											buffer, snapshot);
-			/* reset to original, non-redirected, tid */
-			heapTuple->t_self = *tid;
 
 			if (valid)
 			{
diff --git a/src/test/isolation/expected/predicate-lock-hot-tuple.out b/src/test/isolation/expected/predicate-lock-hot-tuple.out
new file mode 100644
index 0000000000..d1c69bbbd0
--- /dev/null
+++ b/src/test/isolation/expected/predicate-lock-hot-tuple.out
@@ -0,0 +1,20 @@
+Parsed test spec with 2 sessions
+
+starting permutation: b1 b2 r1 r2 w1 w2 c1 c2
+step b1: BEGIN ISOLATION LEVEL SERIALIZABLE;
+step b2: BEGIN ISOLATION LEVEL SERIALIZABLE;
+step r1: SELECT * FROM test WHERE i IN (5, 7)
+i              t              
+
+5              apple          
+7              pear_hot_updated
+step r2: SELECT * FROM test WHERE i IN (5, 7)
+i              t              
+
+5              apple          
+7              pear_hot_updated
+step w1: UPDATE test SET t = 'pear_xact1' WHERE i = 7
+step w2: UPDATE test SET t = 'apple_xact2' WHERE i = 5
+step c1: COMMIT;
+step c2: COMMIT;
+ERROR:  could not serialize access due to read/write dependencies among transactions
diff --git a/src/test/isolation/isolation_schedule b/src/test/isolation/isolation_schedule
index 009c2ec54b..69ae227953 100644
--- a/src/test/isolation/isolation_schedule
+++ b/src/test/isolation/isolation_schedule
@@ -17,6 +17,7 @@ test: partial-index
 test: two-ids
 test: multiple-row-versions
 test: index-only-scan
+test: predicate-lock-hot-tuple
 test: deadlock-simple
 test: deadlock-hard
 test: deadlock-soft
diff --git a/src/test/isolation/specs/predicate-lock-hot-tuple.spec b/src/test/isolation/specs/predicate-lock-hot-tuple.spec
new file mode 100644
index 0000000000..d16fb60533
--- /dev/null
+++ b/src/test/isolation/specs/predicate-lock-hot-tuple.spec
@@ -0,0 +1,37 @@
+# Test predicate locks on HOT updated tuples.
+#
+# This test has two serializable transactions. Both select two rows
+# from the table, and then update one of them.
+# If these were serialized (run one at a time), the transaction that
+# runs later would see one of the rows to be updated.
+#
+# Any overlap between the transactions must cause a serialization failure.
+# We used to have a bug in predicate locking HOT updated tuples, which
+# caused the conflict to be missed, if the row was HOT updated.
+
+setup
+{
+  CREATE TABLE test (i int PRIMARY KEY, t text);
+  INSERT INTO test VALUES (5, 'apple'), (7, 'pear'), (11, 'banana');
+  -- HOT-update 'pear' row.
+  UPDATE test SET t = 'pear_hot_updated' WHERE i = 7;
+}
+
+teardown
+{
+  DROP TABLE test;
+}
+
+session "s1"
+step "b1" { BEGIN ISOLATION LEVEL SERIALIZABLE; }
+step "r1" { SELECT * FROM test WHERE i IN (5, 7) }
+step "w1" { UPDATE test SET t = 'pear_xact1' WHERE i = 7 }
+step "c1" { COMMIT; }
+
+session "s2"
+step "b2" { BEGIN ISOLATION LEVEL SERIALIZABLE; }
+step "r2" { SELECT * FROM test WHERE i IN (5, 7) }
+step "w2" { UPDATE test SET t = 'apple_xact2' WHERE i = 5 }
+step "c2" { COMMIT; }
+
+permutation "b1" "b2" "r1" "r2" "w1" "w2" "c1" "c2"
-- 
2.20.1

#17Thomas Munro
thomas.munro@gmail.com
In reply to: Heikki Linnakangas (#16)
Re: Remove HeapTuple and Buffer dependency for predicate locking functions

On Tue, Aug 6, 2019 at 9:26 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

I had some steam, and wrote a spec that reproduces this bug. It wasn't
actually that hard to reproduce, fortunately. Or unfortunately; people
might well be hitting it in production. I used the "freezetest.spec"
from the 2013 thread as the starting point, and added one UPDATE to the
initialization, so that the test starts with an already HOT-updated
tuple. It should throw a serialization error, but on current master, it
does not. After applying your fix.txt, it does.

Thanks! Ahh, right, I was expecting it to be harder to see an
undetected anomaly, because of the index page lock, but of course we
never actually write to that page so it's just the heap tuple lock
holding everything together.

Your fix.txt seems correct. For clarity, I'd prefer moving things around
a bit, though, so that the t_self is set earlier in the function, at the
same place where the other HeapTuple fields are set. And set blkno and
offnum together, in one ItemPointerSet call. With that, I'm not sure we
need such a verbose comment explaining why t_self needs to be updated
but I kept it for now.

+1

Attached is a patch that contains your fix.txt with the changes for
clarity mentioned above, and an isolationtester test case.

LGTM.

PS. Because heap_hot_search_buffer() now always sets heapTuple->t_self
to the returned tuple version, updating *tid is redundant. And the call
in heapam_index_fetch_tuple() wouldn't need to do
"bslot->base.tupdata.t_self = *tid".

Right, that sounds like a separate improvement for master only.

--
Thomas Munro
https://enterprisedb.com

#18Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Thomas Munro (#17)
Re: Remove HeapTuple and Buffer dependency for predicate locking functions

On 06/08/2019 13:35, Thomas Munro wrote:

On Tue, Aug 6, 2019 at 9:26 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

Attached is a patch that contains your fix.txt with the changes for
clarity mentioned above, and an isolationtester test case.

LGTM.

Pushed, thanks!

- Heikki

#19Ashwin Agrawal
aagrawal@pivotal.io
In reply to: Ashwin Agrawal (#11)
3 attachment(s)
Re: Remove HeapTuple and Buffer dependency for predicate locking functions

On Fri, Aug 2, 2019 at 4:56 PM Ashwin Agrawal <aagrawal@pivotal.io> wrote:

On Wed, Jul 31, 2019 at 2:06 PM Andres Freund <andres@anarazel.de> wrote:

Looking at the code as of master, we currently have:

Super awesome feedback and insights, thank you!

- PredicateLockTuple() calls SubTransGetTopmostTransaction() to figure

out a whether the tuple has been locked by the current
transaction. That check afaict just should be
TransactionIdIsCurrentTransactionId(), without all the other
stuff that's done today.

Agree. v1-0002 patch attached does that now. Please let me know if that's
what you meant.

TransactionIdIsCurrentTransactionId() imo ought to be optimized to

always check for the top level transactionid first - that's a good bet
today, but even moreso for the upcoming AMs that won't have separate
xids for subtransactions. Alternatively we shouldn't make that a
binary search for each subtrans level, but just have a small
simplehash hashtable for xids.

v1-0001 patch checks for GetTopTransactionIdIfAny() first in
TransactionIdIsCurrentTransactionId() which seems yes better in general and
more for future. That mostly meets the needs for current discussion.

The alternative of not using binary search seems bigger refactoring and
should be handled as separate optimization exercise outside of this thread.

- CheckForSerializableConflictOut() wants to get the toplevel xid for
the tuple, because that's the one the predicate hashtable stores.

In your patch you've already moved the HTSV() call etc out of
CheckForSerializableConflictOut(). I'm somewhat inclined to think that
the SubTransGetTopmostTransaction() call ought to go along with that.
I don't really think that belongs in predicate.c, especially if
most/all new AMs don't use subtransaction ids.

The only downside is that currently the
TransactionIdEquals(xid, GetTopTransactionIdIfAny()) check
avoids the SubTransGetTopmostTransaction() check.

But again, the better fix for that seems to be to improve the generic
code. As written the check won't prevent a subtrans lookup for heap
when subtransactions are in use, and it's IME pretty common for tuples
to get looked at again in the transaction that has created them. So
I'm somewhat inclined to think that SubTransGetTopmostTransaction()
should have a fast-path for the current transaction - probably just
employing TransactionIdIsCurrentTransactionId().

That optimization, as Kuntal also mentioned, seems something which can be
done on-top afterwards on current patch.

I don't really see what we gain by having the subtrans handling in the
predicate code. Especially given that we've already moved the HTSV()
handling out, it seems architecturally the wrong place to me - but I
admit that that's a fuzzy argument. The relevant mapping should be one
line in the caller.

Okay, I moved the sub transaction handling out of
CheckForSerializableConflictOut() and have it along side HTSV() now.

The reason I felt leaving subtransaction handling in generic place, was it
might be premature to thing no future AM will need it. Plus, all
serializable function api's having same expectations is easier. Like
PredicateLockTuple() can be passed top or subtransaction id and it can
handle it but with the change CheckForSerializableConflictOut() only be
feed top transaction ID. But its fine and can see the point of AM needing
it can easily get top transaction ID and feed it as heap.

I wonder if it'd be wroth to combine the
TransactionIdIsCurrentTransactionId() calls in the heap cases that
currently do both, PredicateLockTuple() and
HeapCheckForSerializableConflictOut(). The heap_fetch() case probably
isn't commonly that hot a pathq, but heap_hot_search_buffer() is.

Maybe, will give thought to it separate from the current patch.

Minor notes:
- I don't think 'insert_xid' is necessarily great - it could also be the
updating xid etc. And while you can argue that an update is an insert
in the current heap, that's not the case for future AMs.
- to me
@@ -1621,7 +1622,8 @@ heap_hot_search_buffer(ItemPointer tid, Relation
relation, Buffer buffer,
if (valid)
{
ItemPointerSetOffsetNumber(tid, offnum);
-                               PredicateLockTuple(relation, heapTuple,
snapshot);
+                               PredicateLockTID(relation,
&(heapTuple)->t_self, snapshot,
+
HeapTupleHeaderGetXmin(heapTuple->t_data));
if (all_dead)
*all_dead = false;
return true;

What are those parens - as placed they can't do anything. Did you
intend to write &(heapTuple->t_self)? Even that is pretty superfluous,
but it at least clarifies the precedence.

Fixed. No idea what I was thinking there, mostly feel I intended to have
it as like &(heapTuple->t_self).

I'm also a bit confused why we don't need to pass in the offset of the

current tuple, rather than the HOT root tuple here. That's not related
to this patch. But aren't we locking the wrong tuple here, in case of
HOT?

Yes, root is being locked here instead of the HOT. But I don't have full
context on the same. If we wish to fix it though, can be easily done now
with the patch by passing "tid" instead of &(heapTuple->t_self).

- I wonder if CheckForSerializableConflictOutNeeded() shouldn't have a

portion of it's code as a static inline. In particular, it's a shame
that we currently perform external function calls at quite the
frequency when serializable isn't even in use.

I am not sure on portion of the code part? SerializationNeededForRead() is
static inline function in C file. Can't inline
CheckForSerializableConflictOutNeeded() without moving
SerializationNeededForRead() and some other variables to header file.
CheckForSerializableConflictOut() wasn't inline either, so a function call
was performed earlier as well when serializable isn't even in use.

I understand that with refactor, HeapCheckForSerializableConflictOut() is
called which calls CheckForSerializableConflictOutNeeded(). If that's the
problem, for addressing the same, I had proposed alternative way to
refactor. CheckForSerializableConflictOut() can take callback function and
void* callback argument for AM specific check instead. So, the flow would
be AM calling CheckForSerializableConflictOut() as today and only if
serializable in use will invoke the callback to check with AM if more work
should be performed or not. Essentially
HeapCheckForSerializableConflictOut() will become callback function
instead. Due to void* callback argument aspect I didn't like that solution
and felt AM performing checks and calling CheckForSerializableConflictOut()
seems more straight forward.

Attaching re-based version of the patches on top of current master, which
has the fix for HOT serializable predicate locking bug spotted by Andres
committed now.

Attachments:

v2-0001-Optimize-TransactionIdIsCurrentTransactionId.patchtext/x-patch; charset=US-ASCII; name=v2-0001-Optimize-TransactionIdIsCurrentTransactionId.patchDownload
From aec03114cea4235265461f8c427f0633a24a7ec3 Mon Sep 17 00:00:00 2001
From: Ashwin Agrawal <aagrawal@pivotal.io>
Date: Fri, 2 Aug 2019 15:02:28 -0700
Subject: [PATCH v2 1/3] Optimize TransactionIdIsCurrentTransactionId()

If xid is current top transaction, can fast check for the same and
exit early. This should optmize for current heap but also works very
well for future AMs, which don't have separate xid for
subtransactions. This optimization was proposed by Andres.
---
 src/backend/access/transam/xact.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index 1bbaeeebf4d..41952bc4d26 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -871,6 +871,9 @@ TransactionIdIsCurrentTransactionId(TransactionId xid)
 	if (!TransactionIdIsNormal(xid))
 		return false;
 
+	if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
+		return true;
+
 	/*
 	 * In parallel workers, the XIDs we must consider as current are stored in
 	 * ParallelCurrentXids rather than the transaction-state stack.  Note that
-- 
2.19.1

v2-0002-Optimize-PredicateLockTuple.patchtext/x-patch; charset=US-ASCII; name=v2-0002-Optimize-PredicateLockTuple.patchDownload
From 703d123ffd0a35d11e67b4cb07ceaeca1a527425 Mon Sep 17 00:00:00 2001
From: Ashwin Agrawal <aagrawal@pivotal.io>
Date: Fri, 2 Aug 2019 15:21:16 -0700
Subject: [PATCH v2 2/3] Optimize PredicateLockTuple().

PredicateLockTuple fast exits if tuple was written by current
transaction, as for that case it already has the lock. This check can
be easily performed using TransactionIdIsCurrentTransactionId()
instead of using SubTransGetTopmostTransaction(). Since
TransactionIdIsCurrentTransactionId() is all in-memory operation,
makes this efficient compared to SubTransGetTopmostTransaction(),
which can hit the disk.

This simplification was proposed by Andres.
---
 src/backend/storage/lmgr/predicate.c | 22 ++++------------------
 1 file changed, 4 insertions(+), 18 deletions(-)

diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 85a629f4fce..f4d8c2528a1 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -2559,24 +2559,10 @@ PredicateLockTuple(Relation relation, HeapTuple tuple, Snapshot snapshot)
 	 */
 	if (relation->rd_index == NULL)
 	{
-		TransactionId myxid;
-
-		targetxmin = HeapTupleHeaderGetXmin(tuple->t_data);
-
-		myxid = GetTopTransactionIdIfAny();
-		if (TransactionIdIsValid(myxid))
-		{
-			if (TransactionIdFollowsOrEquals(targetxmin, TransactionXmin))
-			{
-				TransactionId xid = SubTransGetTopmostTransaction(targetxmin);
-
-				if (TransactionIdEquals(xid, myxid))
-				{
-					/* We wrote it; we already have a write lock. */
-					return;
-				}
-			}
-		}
+		/* If we wrote it; we already have a write lock. */
+		if (TransactionIdIsCurrentTransactionId(
+				HeapTupleHeaderGetXmin(tuple->t_data)))
+			return;
 	}
 
 	/*
-- 
2.19.1

v2-0003-Remove-HeapTuple-dependency-for-predicate-locking.patchtext/x-patch; charset=US-ASCII; name=v2-0003-Remove-HeapTuple-dependency-for-predicate-locking.patchDownload
From 3b3b90768afea0609513f473ff3d9cdd86ef74c1 Mon Sep 17 00:00:00 2001
From: Ashwin Agrawal <aagrawal@pivotal.io>
Date: Fri, 2 Aug 2019 11:44:21 -0700
Subject: [PATCH v2 3/3] Remove HeapTuple dependency for predicate locking
 functions.

Following changes help to make predicate locking functions generic and
not tied to particular AM.

- PredicateLockTuple() is renamed to PredicateLockTID(). It takes
  ItemPointer and inserting transaction id now instead of HeapTuple.

- CheckForSerializableConflictIn() takes blocknum instead of buffer

- CheckForSerializableConflictOut() no more takes HeapTuple nor buffer
---
 src/backend/access/gin/ginbtree.c        |   2 +-
 src/backend/access/gin/ginfast.c         |   2 +-
 src/backend/access/gin/gininsert.c       |   4 +-
 src/backend/access/gist/gist.c           |   2 +-
 src/backend/access/hash/hashinsert.c     |   2 +-
 src/backend/access/heap/heapam.c         | 119 ++++++++++++++++++---
 src/backend/access/heap/heapam_handler.c |   7 +-
 src/backend/access/index/indexam.c       |   4 +-
 src/backend/access/nbtree/nbtinsert.c    |   4 +-
 src/backend/storage/lmgr/predicate.c     | 127 +++++++----------------
 src/include/access/heapam.h              |   2 +
 src/include/storage/predicate.h          |   9 +-
 12 files changed, 164 insertions(+), 120 deletions(-)

diff --git a/src/backend/access/gin/ginbtree.c b/src/backend/access/gin/ginbtree.c
index 4c29261256a..adeb1d2aa04 100644
--- a/src/backend/access/gin/ginbtree.c
+++ b/src/backend/access/gin/ginbtree.c
@@ -89,7 +89,7 @@ ginFindLeafPage(GinBtree btree, bool searchMode,
 	stack->predictNumber = 1;
 
 	if (rootConflictCheck)
-		CheckForSerializableConflictIn(btree->index, NULL, stack->buffer);
+		CheckForSerializableConflictIn(btree->index, NULL, btree->rootBlkno);
 
 	for (;;)
 	{
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index 439a91b3e61..d7b52476817 100644
--- a/src/backend/access/gin/ginfast.c
+++ b/src/backend/access/gin/ginfast.c
@@ -246,7 +246,7 @@ ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector)
 	 * tree, so it conflicts with all serializable scans.  All scans acquire a
 	 * predicate lock on the metabuffer to represent that.
 	 */
-	CheckForSerializableConflictIn(index, NULL, metabuffer);
+	CheckForSerializableConflictIn(index, NULL, GIN_METAPAGE_BLKNO);
 
 	if (collector->sumsize + collector->ntuples * sizeof(ItemIdData) > GinListPageSize)
 	{
diff --git a/src/backend/access/gin/gininsert.c b/src/backend/access/gin/gininsert.c
index 55eab146173..046a20a3d41 100644
--- a/src/backend/access/gin/gininsert.c
+++ b/src/backend/access/gin/gininsert.c
@@ -221,7 +221,7 @@ ginEntryInsert(GinState *ginstate,
 			return;
 		}
 
-		CheckForSerializableConflictIn(ginstate->index, NULL, stack->buffer);
+		CheckForSerializableConflictIn(ginstate->index, NULL, BufferGetBlockNumber(stack->buffer));
 		/* modify an existing leaf entry */
 		itup = addItemPointersToLeafTuple(ginstate, itup,
 										  items, nitem, buildStats, stack->buffer);
@@ -230,7 +230,7 @@ ginEntryInsert(GinState *ginstate,
 	}
 	else
 	{
-		CheckForSerializableConflictIn(ginstate->index, NULL, stack->buffer);
+		CheckForSerializableConflictIn(ginstate->index, NULL, BufferGetBlockNumber(stack->buffer));
 		/* no match, so construct a new leaf entry */
 		itup = buildFreshLeafTuple(ginstate, attnum, key, category,
 								   items, nitem, buildStats, stack->buffer);
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index e9ca4b82527..4db54a42761 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -1263,7 +1263,7 @@ gistinserttuples(GISTInsertState *state, GISTInsertStack *stack,
 	 * Check for any rw conflicts (in serializable isolation level) just
 	 * before we intend to modify the page
 	 */
-	CheckForSerializableConflictIn(state->r, NULL, stack->buffer);
+	CheckForSerializableConflictIn(state->r, NULL, BufferGetBlockNumber(stack->buffer));
 
 	/* Insert the tuple(s) to the page, splitting the page if necessary */
 	is_split = gistplacetopage(state->r, state->freespace, giststate,
diff --git a/src/backend/access/hash/hashinsert.c b/src/backend/access/hash/hashinsert.c
index 89876d2ccd0..c87fa5970e3 100644
--- a/src/backend/access/hash/hashinsert.c
+++ b/src/backend/access/hash/hashinsert.c
@@ -88,7 +88,7 @@ restart_insert:
 										  &usedmetap);
 	Assert(usedmetap != NULL);
 
-	CheckForSerializableConflictIn(rel, NULL, buf);
+	CheckForSerializableConflictIn(rel, NULL, BufferGetBlockNumber(buf));
 
 	/* remember the primary bucket buffer to release the pin on it at end. */
 	bucket_buf = buf;
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 6b42bdc77b2..e23c57f9391 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -40,6 +40,7 @@
 #include "access/multixact.h"
 #include "access/parallel.h"
 #include "access/relscan.h"
+#include "access/subtrans.h"
 #include "access/sysattr.h"
 #include "access/tableam.h"
 #include "access/transam.h"
@@ -446,7 +447,7 @@ heapgetpage(TableScanDesc sscan, BlockNumber page)
 			else
 				valid = HeapTupleSatisfiesVisibility(&loctup, snapshot, buffer);
 
-			CheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
+			HeapCheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
 											&loctup, buffer, snapshot);
 
 			if (valid)
@@ -668,7 +669,7 @@ heapgettup(HeapScanDesc scan,
 													 snapshot,
 													 scan->rs_cbuf);
 
-				CheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
+				HeapCheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
 												tuple, scan->rs_cbuf,
 												snapshot);
 
@@ -1477,9 +1478,10 @@ heap_fetch(Relation relation,
 	valid = HeapTupleSatisfiesVisibility(tuple, snapshot, buffer);
 
 	if (valid)
-		PredicateLockTuple(relation, tuple, snapshot);
+		PredicateLockTID(relation, &(tuple->t_self), snapshot,
+						 HeapTupleHeaderGetXmin(tuple->t_data));
 
-	CheckForSerializableConflictOut(valid, relation, tuple, buffer, snapshot);
+	HeapCheckForSerializableConflictOut(valid, relation, tuple, buffer, snapshot);
 
 	LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
 
@@ -1610,13 +1612,14 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
 		{
 			/* If it's visible per the snapshot, we must return it */
 			valid = HeapTupleSatisfiesVisibility(heapTuple, snapshot, buffer);
-			CheckForSerializableConflictOut(valid, relation, heapTuple,
-											buffer, snapshot);
+			HeapCheckForSerializableConflictOut(valid, relation, heapTuple,
+												buffer, snapshot);
 
 			if (valid)
 			{
 				ItemPointerSetOffsetNumber(tid, offnum);
-				PredicateLockTuple(relation, heapTuple, snapshot);
+				PredicateLockTID(relation, &heapTuple->t_self, snapshot,
+								 HeapTupleHeaderGetXmin(heapTuple->t_data));
 				if (all_dead)
 					*all_dead = false;
 				return true;
@@ -1750,7 +1753,7 @@ heap_get_latest_tid(TableScanDesc sscan,
 		 * candidate.
 		 */
 		valid = HeapTupleSatisfiesVisibility(&tp, snapshot, buffer);
-		CheckForSerializableConflictOut(valid, relation, &tp, buffer, snapshot);
+		HeapCheckForSerializableConflictOut(valid, relation, &tp, buffer, snapshot);
 		if (valid)
 			*tid = ctid;
 
@@ -1905,7 +1908,7 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
 	 * lock "gaps" as index page locks do.  So we don't need to specify a
 	 * buffer when making the call, which makes for a faster check.
 	 */
-	CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);
+	CheckForSerializableConflictIn(relation, NULL, InvalidBlockNumber);
 
 	/* NO EREPORT(ERROR) from here till changes are logged */
 	START_CRIT_SECTION();
@@ -2159,7 +2162,7 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
 	 * lock "gaps" as index page locks do.  So we don't need to specify a
 	 * buffer when making the call, which makes for a faster check.
 	 */
-	CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);
+	CheckForSerializableConflictIn(relation, NULL, InvalidBlockNumber);
 
 	ndone = 0;
 	while (ndone < ntuples)
@@ -2350,7 +2353,7 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
 	 * lock "gaps" as index page locks do.  So we don't need to specify a
 	 * buffer when making the call.
 	 */
-	CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);
+	CheckForSerializableConflictIn(relation, NULL, InvalidBlockNumber);
 
 	/*
 	 * If tuples are cachable, mark them for invalidation from the caches in
@@ -2664,7 +2667,7 @@ l1:
 	 * being visible to the scan (i.e., an exclusive buffer content lock is
 	 * continuously held from this point until the tuple delete is visible).
 	 */
-	CheckForSerializableConflictIn(relation, &tp, buffer);
+	CheckForSerializableConflictIn(relation, tid, BufferGetBlockNumber(buffer));
 
 	/* replace cid with a combo cid if necessary */
 	HeapTupleHeaderAdjustCmax(tp.t_data, &cid, &iscombo);
@@ -3579,7 +3582,7 @@ l2:
 	 * will include checking the relation level, there is no benefit to a
 	 * separate check for the new tuple.
 	 */
-	CheckForSerializableConflictIn(relation, &oldtup, buffer);
+	CheckForSerializableConflictIn(relation, otid, BufferGetBlockNumber(buffer));
 
 	/*
 	 * At this point newbuf and buffer are both pinned and locked, and newbuf
@@ -9039,3 +9042,93 @@ heap_mask(char *pagedata, BlockNumber blkno)
 		}
 	}
 }
+
+/*
+ * HeapCheckForSerializableConflictOut
+ *		We are reading a tuple which has been modified.  If it is visible to
+ *		us but has been deleted, that indicates a rw-conflict out.  If it's
+ *		not visible and was created by a concurrent (overlapping)
+ *		serializable transaction, that is also a rw-conflict out,
+ *
+ * We will determine the top level xid of the writing transaction with which
+ * we may be in conflict, and check for overlap with our own transaction.
+ * If the transactions overlap (i.e., they cannot see each other's writes),
+ * then we have a conflict out.
+ *
+ * This function should be called just about anywhere in heapam.c where a
+ * tuple has been read. The caller must hold at least a shared lock on the
+ * buffer, because this function might set hint bits on the tuple. There is
+ * currently no known reason to call this function from an index AM.
+ */
+void
+HeapCheckForSerializableConflictOut(bool visible, Relation relation,
+									HeapTuple tuple, Buffer buffer,
+									Snapshot snapshot)
+{
+	TransactionId xid;
+	HTSV_Result htsvResult;
+
+	if (!CheckForSerializableConflictOutNeeded(relation, snapshot))
+		return;
+
+	/*
+	 * Check to see whether the tuple has been written to by a concurrent
+	 * transaction, either to create it not visible to us, or to delete it
+	 * while it is visible to us.  The "visible" bool indicates whether the
+	 * tuple is visible to us, while HeapTupleSatisfiesVacuum checks what else
+	 * is going on with it.
+	 */
+	htsvResult = HeapTupleSatisfiesVacuum(tuple, TransactionXmin, buffer);
+	switch (htsvResult)
+	{
+		case HEAPTUPLE_LIVE:
+			if (visible)
+				return;
+			xid = HeapTupleHeaderGetXmin(tuple->t_data);
+			break;
+		case HEAPTUPLE_RECENTLY_DEAD:
+			if (!visible)
+				return;
+			xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
+			break;
+		case HEAPTUPLE_DELETE_IN_PROGRESS:
+			xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
+			break;
+		case HEAPTUPLE_INSERT_IN_PROGRESS:
+			xid = HeapTupleHeaderGetXmin(tuple->t_data);
+			break;
+		case HEAPTUPLE_DEAD:
+			return;
+		default:
+
+			/*
+			 * The only way to get to this default clause is if a new value is
+			 * added to the enum type without adding it to this switch
+			 * statement.  That's a bug, so elog.
+			 */
+			elog(ERROR, "unrecognized return value from HeapTupleSatisfiesVacuum: %u", htsvResult);
+
+			/*
+			 * In spite of having all enum values covered and calling elog on
+			 * this default, some compilers think this is a code path which
+			 * allows xid to be used below without initialization. Silence
+			 * that warning.
+			 */
+			xid = InvalidTransactionId;
+	}
+
+	Assert(TransactionIdIsValid(xid));
+	Assert(TransactionIdFollowsOrEquals(xid, TransactionXmin));
+
+	/*
+	 * Find top level xid.  Bail out if xid is too early to be a conflict, or
+	 * if it's our own xid.
+	 */
+	if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
+		return;
+	xid = SubTransGetTopmostTransaction(xid);
+	if (TransactionIdPrecedes(xid, TransactionXmin))
+		return;
+
+	return CheckForSerializableConflictOut(relation, xid, snapshot);
+}
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index f1ff01e8cb9..70ed8c8af32 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -2166,9 +2166,10 @@ heapam_scan_bitmap_next_block(TableScanDesc scan,
 			if (valid)
 			{
 				hscan->rs_vistuples[ntup++] = offnum;
-				PredicateLockTuple(scan->rs_rd, &loctup, snapshot);
+				PredicateLockTID(scan->rs_rd, &loctup.t_self, snapshot,
+								 HeapTupleHeaderGetXmin(loctup.t_data));
 			}
-			CheckForSerializableConflictOut(valid, scan->rs_rd, &loctup,
+			HeapCheckForSerializableConflictOut(valid, scan->rs_rd, &loctup,
 											buffer, snapshot);
 		}
 	}
@@ -2356,7 +2357,7 @@ heapam_scan_sample_next_tuple(TableScanDesc scan, SampleScanState *scanstate,
 
 			/* in pagemode, heapgetpage did this for us */
 			if (!pagemode)
-				CheckForSerializableConflictOut(visible, scan->rs_rd, tuple,
+				HeapCheckForSerializableConflictOut(visible, scan->rs_rd, tuple,
 												hscan->rs_cbuf, scan->rs_snapshot);
 
 			/* Try next tuple from same page. */
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index 28edd4aca76..6f1093ae383 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -180,8 +180,8 @@ index_insert(Relation indexRelation,
 
 	if (!(indexRelation->rd_indam->ampredlocks))
 		CheckForSerializableConflictIn(indexRelation,
-									   (HeapTuple) NULL,
-									   InvalidBuffer);
+									   (ItemPointer) NULL,
+									   InvalidBlockNumber);
 
 	return indexRelation->rd_indam->aminsert(indexRelation, values, isnull,
 											 heap_t_ctid, heapRelation,
diff --git a/src/backend/access/nbtree/nbtinsert.c b/src/backend/access/nbtree/nbtinsert.c
index 5890f393f67..6f73fb73638 100644
--- a/src/backend/access/nbtree/nbtinsert.c
+++ b/src/backend/access/nbtree/nbtinsert.c
@@ -290,7 +290,7 @@ top:
 		 * checkingunique and !heapkeyspace cases, but it's okay to use the
 		 * first page the value could be on (with scantid omitted) instead.
 		 */
-		CheckForSerializableConflictIn(rel, NULL, insertstate.buf);
+		CheckForSerializableConflictIn(rel, NULL, BufferGetBlockNumber(insertstate.buf));
 
 		/*
 		 * Do the insertion.  Note that insertstate contains cached binary
@@ -533,7 +533,7 @@ _bt_check_unique(Relation rel, BTInsertState insertstate, Relation heapRel,
 					 * otherwise be masked by this unique constraint
 					 * violation.
 					 */
-					CheckForSerializableConflictIn(rel, NULL, insertstate->buf);
+					CheckForSerializableConflictIn(rel, NULL, BufferGetBlockNumber(insertstate->buf));
 
 					/*
 					 * This is a definite conflict.  Break the tuple down into
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index f4d8c2528a1..4677363be93 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -163,8 +163,8 @@
  *		PredicateLockRelation(Relation relation, Snapshot snapshot)
  *		PredicateLockPage(Relation relation, BlockNumber blkno,
  *						Snapshot snapshot)
- *		PredicateLockTuple(Relation relation, HeapTuple tuple,
- *						Snapshot snapshot)
+ *		PredicateLockTID(Relation relation, ItemPointer tid, Snapshot snapshot,
+ *						 TransactionId insert_xid)
  *		PredicateLockPageSplit(Relation relation, BlockNumber oldblkno,
  *							   BlockNumber newblkno)
  *		PredicateLockPageCombine(Relation relation, BlockNumber oldblkno,
@@ -173,11 +173,10 @@
  *		ReleasePredicateLocks(bool isCommit, bool isReadOnlySafe)
  *
  * conflict detection (may also trigger rollback)
- *		CheckForSerializableConflictOut(bool visible, Relation relation,
- *										HeapTupleData *tup, Buffer buffer,
+ *		CheckForSerializableConflictOut(Relation relation, TransactionId xid,
  *										Snapshot snapshot)
- *		CheckForSerializableConflictIn(Relation relation, HeapTupleData *tup,
- *									   Buffer buffer)
+ *		CheckForSerializableConflictIn(Relation relation, ItemPointer tid,
+ *									   BlockNumber blkno)
  *		CheckTableForSerializableConflictIn(Relation relation)
  *
  * final rollback checking
@@ -193,8 +192,6 @@
 
 #include "postgres.h"
 
-#include "access/heapam.h"
-#include "access/htup_details.h"
 #include "access/parallel.h"
 #include "access/slru.h"
 #include "access/subtrans.h"
@@ -2538,30 +2535,28 @@ PredicateLockPage(Relation relation, BlockNumber blkno, Snapshot snapshot)
 }
 
 /*
- *		PredicateLockTuple
+ *		PredicateLockTID
  *
  * Gets a predicate lock at the tuple level.
  * Skip if not in full serializable transaction isolation level.
  * Skip if this is a temporary table.
  */
 void
-PredicateLockTuple(Relation relation, HeapTuple tuple, Snapshot snapshot)
+PredicateLockTID(Relation relation, ItemPointer tid, Snapshot snapshot,
+				 TransactionId tuple_xid)
 {
 	PREDICATELOCKTARGETTAG tag;
-	ItemPointer tid;
-	TransactionId targetxmin;
 
 	if (!SerializationNeededForRead(relation, snapshot))
 		return;
 
 	/*
-	 * If it's a heap tuple, return if this xact wrote it.
+	 * Return if this xact wrote it.
 	 */
 	if (relation->rd_index == NULL)
 	{
 		/* If we wrote it; we already have a write lock. */
-		if (TransactionIdIsCurrentTransactionId(
-				HeapTupleHeaderGetXmin(tuple->t_data)))
+		if (TransactionIdIsCurrentTransactionId(tuple_xid))
 			return;
 	}
 
@@ -2577,7 +2572,6 @@ PredicateLockTuple(Relation relation, HeapTuple tuple, Snapshot snapshot)
 	if (PredicateLockExists(&tag))
 		return;
 
-	tid = &(tuple->t_self);
 	SET_PREDICATELOCKTARGETTAG_TUPLE(tag,
 									 relation->rd_node.dbNode,
 									 relation->rd_id,
@@ -4022,33 +4016,43 @@ XidIsConcurrent(TransactionId xid)
 	return false;
 }
 
+bool
+CheckForSerializableConflictOutNeeded(Relation relation, Snapshot snapshot)
+{
+	if (!SerializationNeededForRead(relation, snapshot))
+		return false;
+
+	/* Check if someone else has already decided that we need to die */
+	if (SxactIsDoomed(MySerializableXact))
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+				 errmsg("could not serialize access due to read/write dependencies among transactions"),
+				 errdetail_internal("Reason code: Canceled on identification as a pivot, during conflict out checking."),
+				 errhint("The transaction might succeed if retried.")));
+	}
+
+	return true;
+}
+
 /*
  * CheckForSerializableConflictOut
  *		We are reading a tuple which has been modified.  If it is visible to
  *		us but has been deleted, that indicates a rw-conflict out.  If it's
  *		not visible and was created by a concurrent (overlapping)
- *		serializable transaction, that is also a rw-conflict out,
+ *		serializable transaction, that is also a rw-conflict out.
  *
  * We will determine the top level xid of the writing transaction with which
  * we may be in conflict, and check for overlap with our own transaction.
  * If the transactions overlap (i.e., they cannot see each other's writes),
  * then we have a conflict out.
- *
- * This function should be called just about anywhere in heapam.c where a
- * tuple has been read. The caller must hold at least a shared lock on the
- * buffer, because this function might set hint bits on the tuple. There is
- * currently no known reason to call this function from an index AM.
  */
 void
-CheckForSerializableConflictOut(bool visible, Relation relation,
-								HeapTuple tuple, Buffer buffer,
-								Snapshot snapshot)
+CheckForSerializableConflictOut(Relation relation, TransactionId xid, Snapshot snapshot)
 {
-	TransactionId xid;
 	SERIALIZABLEXIDTAG sxidtag;
 	SERIALIZABLEXID *sxid;
 	SERIALIZABLEXACT *sxact;
-	HTSV_Result htsvResult;
 
 	if (!SerializationNeededForRead(relation, snapshot))
 		return;
@@ -4062,64 +4066,8 @@ CheckForSerializableConflictOut(bool visible, Relation relation,
 				 errdetail_internal("Reason code: Canceled on identification as a pivot, during conflict out checking."),
 				 errhint("The transaction might succeed if retried.")));
 	}
-
-	/*
-	 * Check to see whether the tuple has been written to by a concurrent
-	 * transaction, either to create it not visible to us, or to delete it
-	 * while it is visible to us.  The "visible" bool indicates whether the
-	 * tuple is visible to us, while HeapTupleSatisfiesVacuum checks what else
-	 * is going on with it.
-	 */
-	htsvResult = HeapTupleSatisfiesVacuum(tuple, TransactionXmin, buffer);
-	switch (htsvResult)
-	{
-		case HEAPTUPLE_LIVE:
-			if (visible)
-				return;
-			xid = HeapTupleHeaderGetXmin(tuple->t_data);
-			break;
-		case HEAPTUPLE_RECENTLY_DEAD:
-			if (!visible)
-				return;
-			xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
-			break;
-		case HEAPTUPLE_DELETE_IN_PROGRESS:
-			xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
-			break;
-		case HEAPTUPLE_INSERT_IN_PROGRESS:
-			xid = HeapTupleHeaderGetXmin(tuple->t_data);
-			break;
-		case HEAPTUPLE_DEAD:
-			return;
-		default:
-
-			/*
-			 * The only way to get to this default clause is if a new value is
-			 * added to the enum type without adding it to this switch
-			 * statement.  That's a bug, so elog.
-			 */
-			elog(ERROR, "unrecognized return value from HeapTupleSatisfiesVacuum: %u", htsvResult);
-
-			/*
-			 * In spite of having all enum values covered and calling elog on
-			 * this default, some compilers think this is a code path which
-			 * allows xid to be used below without initialization. Silence
-			 * that warning.
-			 */
-			xid = InvalidTransactionId;
-	}
 	Assert(TransactionIdIsValid(xid));
-	Assert(TransactionIdFollowsOrEquals(xid, TransactionXmin));
 
-	/*
-	 * Find top level xid.  Bail out if xid is too early to be a conflict, or
-	 * if it's our own xid.
-	 */
-	if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
-		return;
-	xid = SubTransGetTopmostTransaction(xid);
-	if (TransactionIdPrecedes(xid, TransactionXmin))
-		return;
 	if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
 		return;
 
@@ -4425,8 +4373,7 @@ CheckTargetForConflictsIn(PREDICATELOCKTARGETTAG *targettag)
  * tuple itself.
  */
 void
-CheckForSerializableConflictIn(Relation relation, HeapTuple tuple,
-							   Buffer buffer)
+CheckForSerializableConflictIn(Relation relation, ItemPointer tid, BlockNumber blkno)
 {
 	PREDICATELOCKTARGETTAG targettag;
 
@@ -4456,22 +4403,22 @@ CheckForSerializableConflictIn(Relation relation, HeapTuple tuple,
 	 * It is not possible to take and hold a lock across the checks for all
 	 * granularities because each target could be in a separate partition.
 	 */
-	if (tuple != NULL)
+	if (tid != NULL)
 	{
 		SET_PREDICATELOCKTARGETTAG_TUPLE(targettag,
 										 relation->rd_node.dbNode,
 										 relation->rd_id,
-										 ItemPointerGetBlockNumber(&(tuple->t_self)),
-										 ItemPointerGetOffsetNumber(&(tuple->t_self)));
+										 ItemPointerGetBlockNumber(tid),
+										 ItemPointerGetOffsetNumber(tid));
 		CheckTargetForConflictsIn(&targettag);
 	}
 
-	if (BufferIsValid(buffer))
+	if (blkno != InvalidBlockNumber)
 	{
 		SET_PREDICATELOCKTARGETTAG_PAGE(targettag,
 										relation->rd_node.dbNode,
 										relation->rd_id,
-										BufferGetBlockNumber(buffer));
+										blkno);
 		CheckTargetForConflictsIn(&targettag);
 	}
 
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6bc96..390023c0a55 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -217,5 +217,7 @@ extern bool ResolveCminCmaxDuringDecoding(struct HTAB *tuplecid_data,
 										  HeapTuple htup,
 										  Buffer buffer,
 										  CommandId *cmin, CommandId *cmax);
+extern void HeapCheckForSerializableConflictOut(bool valid, Relation relation, HeapTuple tuple,
+												Buffer buffer, Snapshot snapshot);
 
 #endif							/* HEAPAM_H */
diff --git a/src/include/storage/predicate.h b/src/include/storage/predicate.h
index 376245ecd70..7e1f67b8edb 100644
--- a/src/include/storage/predicate.h
+++ b/src/include/storage/predicate.h
@@ -57,16 +57,17 @@ extern void SetSerializableTransactionSnapshot(Snapshot snapshot,
 extern void RegisterPredicateLockingXid(TransactionId xid);
 extern void PredicateLockRelation(Relation relation, Snapshot snapshot);
 extern void PredicateLockPage(Relation relation, BlockNumber blkno, Snapshot snapshot);
-extern void PredicateLockTuple(Relation relation, HeapTuple tuple, Snapshot snapshot);
+extern void PredicateLockTID(Relation relation, ItemPointer tid, Snapshot snapshot,
+							 TransactionId insert_xid);
 extern void PredicateLockPageSplit(Relation relation, BlockNumber oldblkno, BlockNumber newblkno);
 extern void PredicateLockPageCombine(Relation relation, BlockNumber oldblkno, BlockNumber newblkno);
 extern void TransferPredicateLocksToHeapRelation(Relation relation);
 extern void ReleasePredicateLocks(bool isCommit, bool isReadOnlySafe);
 
 /* conflict detection (may also trigger rollback) */
-extern void CheckForSerializableConflictOut(bool valid, Relation relation, HeapTuple tuple,
-											Buffer buffer, Snapshot snapshot);
-extern void CheckForSerializableConflictIn(Relation relation, HeapTuple tuple, Buffer buffer);
+extern bool CheckForSerializableConflictOutNeeded(Relation relation, Snapshot snapshot);
+extern void CheckForSerializableConflictOut(Relation relation, TransactionId xid, Snapshot snapshot);
+extern void CheckForSerializableConflictIn(Relation relation, ItemPointer tid, BlockNumber blkno);
 extern void CheckTableForSerializableConflictIn(Relation relation);
 
 /* final rollback checking */
-- 
2.19.1

#20Thomas Munro
thomas.munro@gmail.com
In reply to: Ashwin Agrawal (#19)
3 attachment(s)
Re: Remove HeapTuple and Buffer dependency for predicate locking functions

On Thu, Aug 8, 2019 at 6:53 AM Ashwin Agrawal <aagrawal@pivotal.io> wrote:

- I wonder if CheckForSerializableConflictOutNeeded() shouldn't have a
portion of it's code as a static inline. In particular, it's a shame
that we currently perform external function calls at quite the
frequency when serializable isn't even in use.

I am not sure on portion of the code part? SerializationNeededForRead() is static inline function in C file. Can't inline CheckForSerializableConflictOutNeeded() without moving SerializationNeededForRead() and some other variables to header file. CheckForSerializableConflictOut() wasn't inline either, so a function call was performed earlier as well when serializable isn't even in use.

I agree that it's strange that we do these high frequency function
calls just to figure out that we're not even using this stuff, which
ultimately comes down to the static global variable MySerializableXact
being not reachable from (say) an inline function defined in a header.
That's something to look into in another thread.

Attaching re-based version of the patches on top of current master, which has the fix for HOT serializable predicate locking bug spotted by Andres committed now.

I'm planning to commit these three patches on Monday. I've attached
versions with whitespace-only changes from pgindent, and commit
messages lightly massaged and updated to point to this discussion and
reviewers.

Attachments:

v3-0001-Optimize-TransactionIdIsCurrentTransactionId.patchapplication/octet-stream; name=v3-0001-Optimize-TransactionIdIsCurrentTransactionId.patchDownload
From 5e99b32238234cea86fc1b9195a358f05e3a8efc Mon Sep 17 00:00:00 2001
From: Ashwin Agrawal <aagrawal@pivotal.io>
Date: Fri, 2 Aug 2019 15:02:28 -0700
Subject: [PATCH v3 1/3] Optimize TransactionIdIsCurrentTransactionId().

If xid is the current top transaction, we can do a fast check and exit
early.  This should work well for the current heap but also works very
well for proposed AMs that don't have separate xid for subtransactions.

This optimization was proposed by Andres Freund.

Author: Ashwin Agrawal
Reviewed-by: Thomas Munro
Discussion: https://postgr.es/m/CALfoeiv0k3hkEb3Oqk%3DziWqtyk2Jys1UOK5hwRBNeANT_yX%2Bng%40mail.gmail.com
---
 src/backend/access/transam/xact.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index fc55fa6d53..e5e2902465 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -871,6 +871,9 @@ TransactionIdIsCurrentTransactionId(TransactionId xid)
 	if (!TransactionIdIsNormal(xid))
 		return false;
 
+	if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
+		return true;
+
 	/*
 	 * In parallel workers, the XIDs we must consider as current are stored in
 	 * ParallelCurrentXids rather than the transaction-state stack.  Note that
-- 
2.23.0

v3-0002-Optimize-PredicateLockTuple.patchapplication/octet-stream; name=v3-0002-Optimize-PredicateLockTuple.patchDownload
From e73abc40fc6bbc087bbcaa109ef6c5b2fe642853 Mon Sep 17 00:00:00 2001
From: Ashwin Agrawal <aagrawal@pivotal.io>
Date: Fri, 2 Aug 2019 15:21:16 -0700
Subject: [PATCH v3 2/3] Optimize PredicateLockTuple().

PredicateLockTuple() has a fast exit if tuple was written by current
transaction, as for that case it already has a lock.  This check can
be performed using TransactionIdIsCurrentTransactionId() instead of
using SubTransGetTopmostTransaction().  Since
TransactionIdIsCurrentTransactionId() is an all-in-memory operation,
it makes this efficient compared to SubTransGetTopmostTransaction(),
which can hit the disk.

This simplification was proposed by Andres Freund.

Author: Ashwin Agrawal
Reviewed-by: Thomas Munro
Discussion: https://postgr.es/m/CALfoeiv0k3hkEb3Oqk%3DziWqtyk2Jys1UOK5hwRBNeANT_yX%2Bng%40mail.gmail.com
---
 src/backend/storage/lmgr/predicate.c | 21 +++------------------
 1 file changed, 3 insertions(+), 18 deletions(-)

diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 78fb90fb1b..d5468b3e29 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -2559,24 +2559,9 @@ PredicateLockTuple(Relation relation, HeapTuple tuple, Snapshot snapshot)
 	 */
 	if (relation->rd_index == NULL)
 	{
-		TransactionId myxid;
-
-		targetxmin = HeapTupleHeaderGetXmin(tuple->t_data);
-
-		myxid = GetTopTransactionIdIfAny();
-		if (TransactionIdIsValid(myxid))
-		{
-			if (TransactionIdFollowsOrEquals(targetxmin, TransactionXmin))
-			{
-				TransactionId xid = SubTransGetTopmostTransaction(targetxmin);
-
-				if (TransactionIdEquals(xid, myxid))
-				{
-					/* We wrote it; we already have a write lock. */
-					return;
-				}
-			}
-		}
+		/* If we wrote it; we already have a write lock. */
+		if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tuple->t_data)))
+			return;
 	}
 
 	/*
-- 
2.23.0

v3-0003-Remove-HeapTuple-dependency-for-predicate-locking.patchapplication/octet-stream; name=v3-0003-Remove-HeapTuple-dependency-for-predicate-locking.patchDownload
From 1a9e4dcb430b4a93e87f11e35d4123d9926133fd Mon Sep 17 00:00:00 2001
From: Ashwin Agrawal <aagrawal@pivotal.io>
Date: Fri, 2 Aug 2019 11:44:21 -0700
Subject: [PATCH v3 3/3] Remove HeapTuple dependency for predicate locking
 functions.

The following changes help to make the predicate locking functions
generic and not tied to a particular access method:

- PredicateLockTuple() is renamed to PredicateLockTID().  It takes
  ItemPointer and inserting transaction ID instead of HeapTuple.

- CheckForSerializableConflictIn() takes blocknum instead of buffer.

- CheckForSerializableConflictOut() no longer takes HeapTuple or buffer.

Author: Ashwin Agrawal
Reviewed-by: Andres Freund, Kuntal Ghosh, Thomas Munro
Discussion: https://postgr.es/m/CALfoeiv0k3hkEb3Oqk%3DziWqtyk2Jys1UOK5hwRBNeANT_yX%2Bng%40mail.gmail.com
---
 src/backend/access/gin/ginbtree.c        |   2 +-
 src/backend/access/gin/ginfast.c         |   2 +-
 src/backend/access/gin/gininsert.c       |   6 +-
 src/backend/access/gist/gist.c           |   2 +-
 src/backend/access/hash/hashinsert.c     |   2 +-
 src/backend/access/heap/heapam.c         | 125 +++++++++++++++++++---
 src/backend/access/heap/heapam_handler.c |  11 +-
 src/backend/access/index/indexam.c       |   4 +-
 src/backend/access/nbtree/nbtinsert.c    |   4 +-
 src/backend/storage/lmgr/predicate.c     | 126 +++++++----------------
 src/include/access/heapam.h              |   2 +
 src/include/storage/predicate.h          |   9 +-
 12 files changed, 171 insertions(+), 124 deletions(-)

diff --git a/src/backend/access/gin/ginbtree.c b/src/backend/access/gin/ginbtree.c
index 4c29261256..adeb1d2aa0 100644
--- a/src/backend/access/gin/ginbtree.c
+++ b/src/backend/access/gin/ginbtree.c
@@ -89,7 +89,7 @@ ginFindLeafPage(GinBtree btree, bool searchMode,
 	stack->predictNumber = 1;
 
 	if (rootConflictCheck)
-		CheckForSerializableConflictIn(btree->index, NULL, stack->buffer);
+		CheckForSerializableConflictIn(btree->index, NULL, btree->rootBlkno);
 
 	for (;;)
 	{
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index 439a91b3e6..d7b5247681 100644
--- a/src/backend/access/gin/ginfast.c
+++ b/src/backend/access/gin/ginfast.c
@@ -246,7 +246,7 @@ ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector)
 	 * tree, so it conflicts with all serializable scans.  All scans acquire a
 	 * predicate lock on the metabuffer to represent that.
 	 */
-	CheckForSerializableConflictIn(index, NULL, metabuffer);
+	CheckForSerializableConflictIn(index, NULL, GIN_METAPAGE_BLKNO);
 
 	if (collector->sumsize + collector->ntuples * sizeof(ItemIdData) > GinListPageSize)
 	{
diff --git a/src/backend/access/gin/gininsert.c b/src/backend/access/gin/gininsert.c
index 6eb83639aa..7f02c5143a 100644
--- a/src/backend/access/gin/gininsert.c
+++ b/src/backend/access/gin/gininsert.c
@@ -217,7 +217,8 @@ ginEntryInsert(GinState *ginstate,
 			return;
 		}
 
-		CheckForSerializableConflictIn(ginstate->index, NULL, stack->buffer);
+		CheckForSerializableConflictIn(ginstate->index, NULL,
+									   BufferGetBlockNumber(stack->buffer));
 		/* modify an existing leaf entry */
 		itup = addItemPointersToLeafTuple(ginstate, itup,
 										  items, nitem, buildStats, stack->buffer);
@@ -226,7 +227,8 @@ ginEntryInsert(GinState *ginstate,
 	}
 	else
 	{
-		CheckForSerializableConflictIn(ginstate->index, NULL, stack->buffer);
+		CheckForSerializableConflictIn(ginstate->index, NULL,
+									   BufferGetBlockNumber(stack->buffer));
 		/* no match, so construct a new leaf entry */
 		itup = buildFreshLeafTuple(ginstate, attnum, key, category,
 								   items, nitem, buildStats, stack->buffer);
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 0cc87911d6..efb1a0cffa 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -1263,7 +1263,7 @@ gistinserttuples(GISTInsertState *state, GISTInsertStack *stack,
 	 * Check for any rw conflicts (in serializable isolation level) just
 	 * before we intend to modify the page
 	 */
-	CheckForSerializableConflictIn(state->r, NULL, stack->buffer);
+	CheckForSerializableConflictIn(state->r, NULL, BufferGetBlockNumber(stack->buffer));
 
 	/* Insert the tuple(s) to the page, splitting the page if necessary */
 	is_split = gistplacetopage(state->r, state->freespace, giststate,
diff --git a/src/backend/access/hash/hashinsert.c b/src/backend/access/hash/hashinsert.c
index 89876d2ccd..c87fa5970e 100644
--- a/src/backend/access/hash/hashinsert.c
+++ b/src/backend/access/hash/hashinsert.c
@@ -88,7 +88,7 @@ restart_insert:
 										  &usedmetap);
 	Assert(usedmetap != NULL);
 
-	CheckForSerializableConflictIn(rel, NULL, buf);
+	CheckForSerializableConflictIn(rel, NULL, BufferGetBlockNumber(buf));
 
 	/* remember the primary bucket buffer to release the pin on it at end. */
 	bucket_buf = buf;
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 0128bb34ef..77bf3a31f1 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -41,6 +41,7 @@
 #include "access/multixact.h"
 #include "access/parallel.h"
 #include "access/relscan.h"
+#include "access/subtrans.h"
 #include "access/sysattr.h"
 #include "access/tableam.h"
 #include "access/transam.h"
@@ -446,8 +447,8 @@ heapgetpage(TableScanDesc sscan, BlockNumber page)
 			else
 				valid = HeapTupleSatisfiesVisibility(&loctup, snapshot, buffer);
 
-			CheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
-											&loctup, buffer, snapshot);
+			HeapCheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
+												&loctup, buffer, snapshot);
 
 			if (valid)
 				scan->rs_vistuples[ntup++] = lineoff;
@@ -668,9 +669,9 @@ heapgettup(HeapScanDesc scan,
 													 snapshot,
 													 scan->rs_cbuf);
 
-				CheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
-												tuple, scan->rs_cbuf,
-												snapshot);
+				HeapCheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
+													tuple, scan->rs_cbuf,
+													snapshot);
 
 				if (valid && key != NULL)
 					HeapKeyTest(tuple, RelationGetDescr(scan->rs_base.rs_rd),
@@ -1477,9 +1478,10 @@ heap_fetch(Relation relation,
 	valid = HeapTupleSatisfiesVisibility(tuple, snapshot, buffer);
 
 	if (valid)
-		PredicateLockTuple(relation, tuple, snapshot);
+		PredicateLockTID(relation, &(tuple->t_self), snapshot,
+						 HeapTupleHeaderGetXmin(tuple->t_data));
 
-	CheckForSerializableConflictOut(valid, relation, tuple, buffer, snapshot);
+	HeapCheckForSerializableConflictOut(valid, relation, tuple, buffer, snapshot);
 
 	LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
 
@@ -1610,13 +1612,14 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
 		{
 			/* If it's visible per the snapshot, we must return it */
 			valid = HeapTupleSatisfiesVisibility(heapTuple, snapshot, buffer);
-			CheckForSerializableConflictOut(valid, relation, heapTuple,
-											buffer, snapshot);
+			HeapCheckForSerializableConflictOut(valid, relation, heapTuple,
+												buffer, snapshot);
 
 			if (valid)
 			{
 				ItemPointerSetOffsetNumber(tid, offnum);
-				PredicateLockTuple(relation, heapTuple, snapshot);
+				PredicateLockTID(relation, &heapTuple->t_self, snapshot,
+								 HeapTupleHeaderGetXmin(heapTuple->t_data));
 				if (all_dead)
 					*all_dead = false;
 				return true;
@@ -1750,7 +1753,7 @@ heap_get_latest_tid(TableScanDesc sscan,
 		 * candidate.
 		 */
 		valid = HeapTupleSatisfiesVisibility(&tp, snapshot, buffer);
-		CheckForSerializableConflictOut(valid, relation, &tp, buffer, snapshot);
+		HeapCheckForSerializableConflictOut(valid, relation, &tp, buffer, snapshot);
 		if (valid)
 			*tid = ctid;
 
@@ -1905,7 +1908,7 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
 	 * lock "gaps" as index page locks do.  So we don't need to specify a
 	 * buffer when making the call, which makes for a faster check.
 	 */
-	CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);
+	CheckForSerializableConflictIn(relation, NULL, InvalidBlockNumber);
 
 	/* NO EREPORT(ERROR) from here till changes are logged */
 	START_CRIT_SECTION();
@@ -2159,7 +2162,7 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
 	 * lock "gaps" as index page locks do.  So we don't need to specify a
 	 * buffer when making the call, which makes for a faster check.
 	 */
-	CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);
+	CheckForSerializableConflictIn(relation, NULL, InvalidBlockNumber);
 
 	ndone = 0;
 	while (ndone < ntuples)
@@ -2350,7 +2353,7 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
 	 * lock "gaps" as index page locks do.  So we don't need to specify a
 	 * buffer when making the call.
 	 */
-	CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);
+	CheckForSerializableConflictIn(relation, NULL, InvalidBlockNumber);
 
 	/*
 	 * If tuples are cachable, mark them for invalidation from the caches in
@@ -2664,7 +2667,7 @@ l1:
 	 * being visible to the scan (i.e., an exclusive buffer content lock is
 	 * continuously held from this point until the tuple delete is visible).
 	 */
-	CheckForSerializableConflictIn(relation, &tp, buffer);
+	CheckForSerializableConflictIn(relation, tid, BufferGetBlockNumber(buffer));
 
 	/* replace cid with a combo cid if necessary */
 	HeapTupleHeaderAdjustCmax(tp.t_data, &cid, &iscombo);
@@ -3580,7 +3583,7 @@ l2:
 	 * will include checking the relation level, there is no benefit to a
 	 * separate check for the new tuple.
 	 */
-	CheckForSerializableConflictIn(relation, &oldtup, buffer);
+	CheckForSerializableConflictIn(relation, otid, BufferGetBlockNumber(buffer));
 
 	/*
 	 * At this point newbuf and buffer are both pinned and locked, and newbuf
@@ -9043,3 +9046,93 @@ heap_mask(char *pagedata, BlockNumber blkno)
 		}
 	}
 }
+
+/*
+ * HeapCheckForSerializableConflictOut
+ *		We are reading a tuple which has been modified.  If it is visible to
+ *		us but has been deleted, that indicates a rw-conflict out.  If it's
+ *		not visible and was created by a concurrent (overlapping)
+ *		serializable transaction, that is also a rw-conflict out,
+ *
+ * We will determine the top level xid of the writing transaction with which
+ * we may be in conflict, and check for overlap with our own transaction.
+ * If the transactions overlap (i.e., they cannot see each other's writes),
+ * then we have a conflict out.
+ *
+ * This function should be called just about anywhere in heapam.c where a
+ * tuple has been read. The caller must hold at least a shared lock on the
+ * buffer, because this function might set hint bits on the tuple. There is
+ * currently no known reason to call this function from an index AM.
+ */
+void
+HeapCheckForSerializableConflictOut(bool visible, Relation relation,
+									HeapTuple tuple, Buffer buffer,
+									Snapshot snapshot)
+{
+	TransactionId xid;
+	HTSV_Result htsvResult;
+
+	if (!CheckForSerializableConflictOutNeeded(relation, snapshot))
+		return;
+
+	/*
+	 * Check to see whether the tuple has been written to by a concurrent
+	 * transaction, either to create it not visible to us, or to delete it
+	 * while it is visible to us.  The "visible" bool indicates whether the
+	 * tuple is visible to us, while HeapTupleSatisfiesVacuum checks what else
+	 * is going on with it.
+	 */
+	htsvResult = HeapTupleSatisfiesVacuum(tuple, TransactionXmin, buffer);
+	switch (htsvResult)
+	{
+		case HEAPTUPLE_LIVE:
+			if (visible)
+				return;
+			xid = HeapTupleHeaderGetXmin(tuple->t_data);
+			break;
+		case HEAPTUPLE_RECENTLY_DEAD:
+			if (!visible)
+				return;
+			xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
+			break;
+		case HEAPTUPLE_DELETE_IN_PROGRESS:
+			xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
+			break;
+		case HEAPTUPLE_INSERT_IN_PROGRESS:
+			xid = HeapTupleHeaderGetXmin(tuple->t_data);
+			break;
+		case HEAPTUPLE_DEAD:
+			return;
+		default:
+
+			/*
+			 * The only way to get to this default clause is if a new value is
+			 * added to the enum type without adding it to this switch
+			 * statement.  That's a bug, so elog.
+			 */
+			elog(ERROR, "unrecognized return value from HeapTupleSatisfiesVacuum: %u", htsvResult);
+
+			/*
+			 * In spite of having all enum values covered and calling elog on
+			 * this default, some compilers think this is a code path which
+			 * allows xid to be used below without initialization. Silence
+			 * that warning.
+			 */
+			xid = InvalidTransactionId;
+	}
+
+	Assert(TransactionIdIsValid(xid));
+	Assert(TransactionIdFollowsOrEquals(xid, TransactionXmin));
+
+	/*
+	 * Find top level xid.  Bail out if xid is too early to be a conflict, or
+	 * if it's our own xid.
+	 */
+	if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
+		return;
+	xid = SubTransGetTopmostTransaction(xid);
+	if (TransactionIdPrecedes(xid, TransactionXmin))
+		return;
+
+	return CheckForSerializableConflictOut(relation, xid, snapshot);
+}
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 2dd8821fac..c3ee2f0154 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -2166,10 +2166,11 @@ heapam_scan_bitmap_next_block(TableScanDesc scan,
 			if (valid)
 			{
 				hscan->rs_vistuples[ntup++] = offnum;
-				PredicateLockTuple(scan->rs_rd, &loctup, snapshot);
+				PredicateLockTID(scan->rs_rd, &loctup.t_self, snapshot,
+								 HeapTupleHeaderGetXmin(loctup.t_data));
 			}
-			CheckForSerializableConflictOut(valid, scan->rs_rd, &loctup,
-											buffer, snapshot);
+			HeapCheckForSerializableConflictOut(valid, scan->rs_rd, &loctup,
+												buffer, snapshot);
 		}
 	}
 
@@ -2356,8 +2357,8 @@ heapam_scan_sample_next_tuple(TableScanDesc scan, SampleScanState *scanstate,
 
 			/* in pagemode, heapgetpage did this for us */
 			if (!pagemode)
-				CheckForSerializableConflictOut(visible, scan->rs_rd, tuple,
-												hscan->rs_cbuf, scan->rs_snapshot);
+				HeapCheckForSerializableConflictOut(visible, scan->rs_rd, tuple,
+													hscan->rs_cbuf, scan->rs_snapshot);
 
 			/* Try next tuple from same page. */
 			if (!visible)
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index 9dfa0ddfbb..d67afe141d 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -180,8 +180,8 @@ index_insert(Relation indexRelation,
 
 	if (!(indexRelation->rd_indam->ampredlocks))
 		CheckForSerializableConflictIn(indexRelation,
-									   (HeapTuple) NULL,
-									   InvalidBuffer);
+									   (ItemPointer) NULL,
+									   InvalidBlockNumber);
 
 	return indexRelation->rd_indam->aminsert(indexRelation, values, isnull,
 											 heap_t_ctid, heapRelation,
diff --git a/src/backend/access/nbtree/nbtinsert.c b/src/backend/access/nbtree/nbtinsert.c
index b84bf1c3df..f8afc2beea 100644
--- a/src/backend/access/nbtree/nbtinsert.c
+++ b/src/backend/access/nbtree/nbtinsert.c
@@ -290,7 +290,7 @@ top:
 		 * checkingunique and !heapkeyspace cases, but it's okay to use the
 		 * first page the value could be on (with scantid omitted) instead.
 		 */
-		CheckForSerializableConflictIn(rel, NULL, insertstate.buf);
+		CheckForSerializableConflictIn(rel, NULL, BufferGetBlockNumber(insertstate.buf));
 
 		/*
 		 * Do the insertion.  Note that insertstate contains cached binary
@@ -533,7 +533,7 @@ _bt_check_unique(Relation rel, BTInsertState insertstate, Relation heapRel,
 					 * otherwise be masked by this unique constraint
 					 * violation.
 					 */
-					CheckForSerializableConflictIn(rel, NULL, insertstate->buf);
+					CheckForSerializableConflictIn(rel, NULL, BufferGetBlockNumber(insertstate->buf));
 
 					/*
 					 * This is a definite conflict.  Break the tuple down into
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index d5468b3e29..812aa496f4 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -163,8 +163,8 @@
  *		PredicateLockRelation(Relation relation, Snapshot snapshot)
  *		PredicateLockPage(Relation relation, BlockNumber blkno,
  *						Snapshot snapshot)
- *		PredicateLockTuple(Relation relation, HeapTuple tuple,
- *						Snapshot snapshot)
+ *		PredicateLockTID(Relation relation, ItemPointer tid, Snapshot snapshot,
+ *						 TransactionId insert_xid)
  *		PredicateLockPageSplit(Relation relation, BlockNumber oldblkno,
  *							   BlockNumber newblkno)
  *		PredicateLockPageCombine(Relation relation, BlockNumber oldblkno,
@@ -173,11 +173,10 @@
  *		ReleasePredicateLocks(bool isCommit, bool isReadOnlySafe)
  *
  * conflict detection (may also trigger rollback)
- *		CheckForSerializableConflictOut(bool visible, Relation relation,
- *										HeapTupleData *tup, Buffer buffer,
+ *		CheckForSerializableConflictOut(Relation relation, TransactionId xid,
  *										Snapshot snapshot)
- *		CheckForSerializableConflictIn(Relation relation, HeapTupleData *tup,
- *									   Buffer buffer)
+ *		CheckForSerializableConflictIn(Relation relation, ItemPointer tid,
+ *									   BlockNumber blkno)
  *		CheckTableForSerializableConflictIn(Relation relation)
  *
  * final rollback checking
@@ -193,8 +192,6 @@
 
 #include "postgres.h"
 
-#include "access/heapam.h"
-#include "access/htup_details.h"
 #include "access/parallel.h"
 #include "access/slru.h"
 #include "access/subtrans.h"
@@ -2538,29 +2535,28 @@ PredicateLockPage(Relation relation, BlockNumber blkno, Snapshot snapshot)
 }
 
 /*
- *		PredicateLockTuple
+ *		PredicateLockTID
  *
  * Gets a predicate lock at the tuple level.
  * Skip if not in full serializable transaction isolation level.
  * Skip if this is a temporary table.
  */
 void
-PredicateLockTuple(Relation relation, HeapTuple tuple, Snapshot snapshot)
+PredicateLockTID(Relation relation, ItemPointer tid, Snapshot snapshot,
+				 TransactionId tuple_xid)
 {
 	PREDICATELOCKTARGETTAG tag;
-	ItemPointer tid;
-	TransactionId targetxmin;
 
 	if (!SerializationNeededForRead(relation, snapshot))
 		return;
 
 	/*
-	 * If it's a heap tuple, return if this xact wrote it.
+	 * Return if this xact wrote it.
 	 */
 	if (relation->rd_index == NULL)
 	{
 		/* If we wrote it; we already have a write lock. */
-		if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tuple->t_data)))
+		if (TransactionIdIsCurrentTransactionId(tuple_xid))
 			return;
 	}
 
@@ -2576,7 +2572,6 @@ PredicateLockTuple(Relation relation, HeapTuple tuple, Snapshot snapshot)
 	if (PredicateLockExists(&tag))
 		return;
 
-	tid = &(tuple->t_self);
 	SET_PREDICATELOCKTARGETTAG_TUPLE(tag,
 									 relation->rd_node.dbNode,
 									 relation->rd_id,
@@ -4021,33 +4016,43 @@ XidIsConcurrent(TransactionId xid)
 	return false;
 }
 
+bool
+CheckForSerializableConflictOutNeeded(Relation relation, Snapshot snapshot)
+{
+	if (!SerializationNeededForRead(relation, snapshot))
+		return false;
+
+	/* Check if someone else has already decided that we need to die */
+	if (SxactIsDoomed(MySerializableXact))
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+				 errmsg("could not serialize access due to read/write dependencies among transactions"),
+				 errdetail_internal("Reason code: Canceled on identification as a pivot, during conflict out checking."),
+				 errhint("The transaction might succeed if retried.")));
+	}
+
+	return true;
+}
+
 /*
  * CheckForSerializableConflictOut
  *		We are reading a tuple which has been modified.  If it is visible to
  *		us but has been deleted, that indicates a rw-conflict out.  If it's
  *		not visible and was created by a concurrent (overlapping)
- *		serializable transaction, that is also a rw-conflict out,
+ *		serializable transaction, that is also a rw-conflict out.
  *
  * We will determine the top level xid of the writing transaction with which
  * we may be in conflict, and check for overlap with our own transaction.
  * If the transactions overlap (i.e., they cannot see each other's writes),
  * then we have a conflict out.
- *
- * This function should be called just about anywhere in heapam.c where a
- * tuple has been read. The caller must hold at least a shared lock on the
- * buffer, because this function might set hint bits on the tuple. There is
- * currently no known reason to call this function from an index AM.
  */
 void
-CheckForSerializableConflictOut(bool visible, Relation relation,
-								HeapTuple tuple, Buffer buffer,
-								Snapshot snapshot)
+CheckForSerializableConflictOut(Relation relation, TransactionId xid, Snapshot snapshot)
 {
-	TransactionId xid;
 	SERIALIZABLEXIDTAG sxidtag;
 	SERIALIZABLEXID *sxid;
 	SERIALIZABLEXACT *sxact;
-	HTSV_Result htsvResult;
 
 	if (!SerializationNeededForRead(relation, snapshot))
 		return;
@@ -4061,64 +4066,8 @@ CheckForSerializableConflictOut(bool visible, Relation relation,
 				 errdetail_internal("Reason code: Canceled on identification as a pivot, during conflict out checking."),
 				 errhint("The transaction might succeed if retried.")));
 	}
-
-	/*
-	 * Check to see whether the tuple has been written to by a concurrent
-	 * transaction, either to create it not visible to us, or to delete it
-	 * while it is visible to us.  The "visible" bool indicates whether the
-	 * tuple is visible to us, while HeapTupleSatisfiesVacuum checks what else
-	 * is going on with it.
-	 */
-	htsvResult = HeapTupleSatisfiesVacuum(tuple, TransactionXmin, buffer);
-	switch (htsvResult)
-	{
-		case HEAPTUPLE_LIVE:
-			if (visible)
-				return;
-			xid = HeapTupleHeaderGetXmin(tuple->t_data);
-			break;
-		case HEAPTUPLE_RECENTLY_DEAD:
-			if (!visible)
-				return;
-			xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
-			break;
-		case HEAPTUPLE_DELETE_IN_PROGRESS:
-			xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
-			break;
-		case HEAPTUPLE_INSERT_IN_PROGRESS:
-			xid = HeapTupleHeaderGetXmin(tuple->t_data);
-			break;
-		case HEAPTUPLE_DEAD:
-			return;
-		default:
-
-			/*
-			 * The only way to get to this default clause is if a new value is
-			 * added to the enum type without adding it to this switch
-			 * statement.  That's a bug, so elog.
-			 */
-			elog(ERROR, "unrecognized return value from HeapTupleSatisfiesVacuum: %u", htsvResult);
-
-			/*
-			 * In spite of having all enum values covered and calling elog on
-			 * this default, some compilers think this is a code path which
-			 * allows xid to be used below without initialization. Silence
-			 * that warning.
-			 */
-			xid = InvalidTransactionId;
-	}
 	Assert(TransactionIdIsValid(xid));
-	Assert(TransactionIdFollowsOrEquals(xid, TransactionXmin));
 
-	/*
-	 * Find top level xid.  Bail out if xid is too early to be a conflict, or
-	 * if it's our own xid.
-	 */
-	if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
-		return;
-	xid = SubTransGetTopmostTransaction(xid);
-	if (TransactionIdPrecedes(xid, TransactionXmin))
-		return;
 	if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
 		return;
 
@@ -4424,8 +4373,7 @@ CheckTargetForConflictsIn(PREDICATELOCKTARGETTAG *targettag)
  * tuple itself.
  */
 void
-CheckForSerializableConflictIn(Relation relation, HeapTuple tuple,
-							   Buffer buffer)
+CheckForSerializableConflictIn(Relation relation, ItemPointer tid, BlockNumber blkno)
 {
 	PREDICATELOCKTARGETTAG targettag;
 
@@ -4455,22 +4403,22 @@ CheckForSerializableConflictIn(Relation relation, HeapTuple tuple,
 	 * It is not possible to take and hold a lock across the checks for all
 	 * granularities because each target could be in a separate partition.
 	 */
-	if (tuple != NULL)
+	if (tid != NULL)
 	{
 		SET_PREDICATELOCKTARGETTAG_TUPLE(targettag,
 										 relation->rd_node.dbNode,
 										 relation->rd_id,
-										 ItemPointerGetBlockNumber(&(tuple->t_self)),
-										 ItemPointerGetOffsetNumber(&(tuple->t_self)));
+										 ItemPointerGetBlockNumber(tid),
+										 ItemPointerGetOffsetNumber(tid));
 		CheckTargetForConflictsIn(&targettag);
 	}
 
-	if (BufferIsValid(buffer))
+	if (blkno != InvalidBlockNumber)
 	{
 		SET_PREDICATELOCKTARGETTAG_PAGE(targettag,
 										relation->rd_node.dbNode,
 										relation->rd_id,
-										BufferGetBlockNumber(buffer));
+										blkno);
 		CheckTargetForConflictsIn(&targettag);
 	}
 
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6bc9..390023c0a5 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -217,5 +217,7 @@ extern bool ResolveCminCmaxDuringDecoding(struct HTAB *tuplecid_data,
 										  HeapTuple htup,
 										  Buffer buffer,
 										  CommandId *cmin, CommandId *cmax);
+extern void HeapCheckForSerializableConflictOut(bool valid, Relation relation, HeapTuple tuple,
+												Buffer buffer, Snapshot snapshot);
 
 #endif							/* HEAPAM_H */
diff --git a/src/include/storage/predicate.h b/src/include/storage/predicate.h
index 376245ecd7..7e1f67b8ed 100644
--- a/src/include/storage/predicate.h
+++ b/src/include/storage/predicate.h
@@ -57,16 +57,17 @@ extern void SetSerializableTransactionSnapshot(Snapshot snapshot,
 extern void RegisterPredicateLockingXid(TransactionId xid);
 extern void PredicateLockRelation(Relation relation, Snapshot snapshot);
 extern void PredicateLockPage(Relation relation, BlockNumber blkno, Snapshot snapshot);
-extern void PredicateLockTuple(Relation relation, HeapTuple tuple, Snapshot snapshot);
+extern void PredicateLockTID(Relation relation, ItemPointer tid, Snapshot snapshot,
+							 TransactionId insert_xid);
 extern void PredicateLockPageSplit(Relation relation, BlockNumber oldblkno, BlockNumber newblkno);
 extern void PredicateLockPageCombine(Relation relation, BlockNumber oldblkno, BlockNumber newblkno);
 extern void TransferPredicateLocksToHeapRelation(Relation relation);
 extern void ReleasePredicateLocks(bool isCommit, bool isReadOnlySafe);
 
 /* conflict detection (may also trigger rollback) */
-extern void CheckForSerializableConflictOut(bool valid, Relation relation, HeapTuple tuple,
-											Buffer buffer, Snapshot snapshot);
-extern void CheckForSerializableConflictIn(Relation relation, HeapTuple tuple, Buffer buffer);
+extern bool CheckForSerializableConflictOutNeeded(Relation relation, Snapshot snapshot);
+extern void CheckForSerializableConflictOut(Relation relation, TransactionId xid, Snapshot snapshot);
+extern void CheckForSerializableConflictIn(Relation relation, ItemPointer tid, BlockNumber blkno);
 extern void CheckTableForSerializableConflictIn(Relation relation);
 
 /* final rollback checking */
-- 
2.23.0

#21Ashwin Agrawal
aagrawal@pivotal.io
In reply to: Thomas Munro (#20)
Re: Remove HeapTuple and Buffer dependency for predicate locking functions

On Thu, Nov 7, 2019 at 8:44 PM Thomas Munro <thomas.munro@gmail.com> wrote:

On Thu, Aug 8, 2019 at 6:53 AM Ashwin Agrawal <aagrawal@pivotal.io> wrote:

- I wonder if CheckForSerializableConflictOutNeeded() shouldn't have a
portion of it's code as a static inline. In particular, it's a shame
that we currently perform external function calls at quite the
frequency when serializable isn't even in use.

I am not sure on portion of the code part? SerializationNeededForRead()

is static inline function in C file. Can't inline
CheckForSerializableConflictOutNeeded() without moving
SerializationNeededForRead() and some other variables to header file.
CheckForSerializableConflictOut() wasn't inline either, so a function call
was performed earlier as well when serializable isn't even in use.

I agree that it's strange that we do these high frequency function
calls just to figure out that we're not even using this stuff, which
ultimately comes down to the static global variable MySerializableXact
being not reachable from (say) an inline function defined in a header.
That's something to look into in another thread.

Attaching re-based version of the patches on top of current master,

which has the fix for HOT serializable predicate locking bug spotted by
Andres committed now.

I'm planning to commit these three patches on Monday. I've attached
versions with whitespace-only changes from pgindent, and commit
messages lightly massaged and updated to point to this discussion and
reviewers.

Thanks a lot, sounds good.

#22Thomas Munro
thomas.munro@gmail.com
In reply to: Ashwin Agrawal (#21)
2 attachment(s)
Re: Remove HeapTuple and Buffer dependency for predicate locking functions

On Sat, Nov 9, 2019 at 8:41 AM Ashwin Agrawal <aagrawal@pivotal.io> wrote:

On Thu, Nov 7, 2019 at 8:44 PM Thomas Munro <thomas.munro@gmail.com> wrote:

I'm planning to commit these three patches on Monday. I've attached
versions with whitespace-only changes from pgindent, and commit
messages lightly massaged and updated to point to this discussion and
reviewers.

Thanks a lot, sounds good.

Hi Ashwin,

I pushed the first two, but on another read-through of the main patch
I didn't like the comments for CheckForSerializableConflictOut() or
the fact that it checks SerializationNeededForRead() again, after I
thought a bit about what the contract for this API is now. Here's a
version with small fixup that I'd like to squash into the patch.
Please let me know what you think, or if you see how to improve it
further.

Attachments:

v4-0001-Remove-dependency-on-HeapTuple-from-predicate-loc.patchapplication/octet-stream; name=v4-0001-Remove-dependency-on-HeapTuple-from-predicate-loc.patchDownload
From 3b63000f0485cf08cbffabeaa736ca92b7847fec Mon Sep 17 00:00:00 2001
From: Thomas Munro <tmunro@postgresql.org>
Date: Mon, 11 Nov 2019 16:35:58 +1300
Subject: [PATCH v4 1/2] Remove dependency on HeapTuple from predicate locking
 functions.

The following changes make the predicate locking functions more
generic and suitable for use by future access methods:

- PredicateLockTuple() is renamed to PredicateLockTID().  It takes
  ItemPointer and inserting transaction ID instead of HeapTuple.

- CheckForSerializableConflictIn() takes blocknum instead of buffer.

- CheckForSerializableConflictOut() no longer takes HeapTuple or buffer.

Author: Ashwin Agrawal
Reviewed-by: Andres Freund, Kuntal Ghosh, Thomas Munro
Discussion: https://postgr.es/m/CALfoeiv0k3hkEb3Oqk%3DziWqtyk2Jys1UOK5hwRBNeANT_yX%2Bng%40mail.gmail.com
---
 src/backend/access/gin/ginbtree.c        |   2 +-
 src/backend/access/gin/ginfast.c         |   2 +-
 src/backend/access/gin/gininsert.c       |   6 +-
 src/backend/access/gist/gist.c           |   2 +-
 src/backend/access/hash/hashinsert.c     |   2 +-
 src/backend/access/heap/heapam.c         | 125 ++++++++++++++++++++---
 src/backend/access/heap/heapam_handler.c |  11 +-
 src/backend/access/index/indexam.c       |   4 +-
 src/backend/access/nbtree/nbtinsert.c    |   4 +-
 src/backend/storage/lmgr/predicate.c     | 125 +++++++----------------
 src/include/access/heapam.h              |   2 +
 src/include/storage/predicate.h          |   9 +-
 12 files changed, 171 insertions(+), 123 deletions(-)

diff --git a/src/backend/access/gin/ginbtree.c b/src/backend/access/gin/ginbtree.c
index 4c29261256..adeb1d2aa0 100644
--- a/src/backend/access/gin/ginbtree.c
+++ b/src/backend/access/gin/ginbtree.c
@@ -89,7 +89,7 @@ ginFindLeafPage(GinBtree btree, bool searchMode,
 	stack->predictNumber = 1;
 
 	if (rootConflictCheck)
-		CheckForSerializableConflictIn(btree->index, NULL, stack->buffer);
+		CheckForSerializableConflictIn(btree->index, NULL, btree->rootBlkno);
 
 	for (;;)
 	{
diff --git a/src/backend/access/gin/ginfast.c b/src/backend/access/gin/ginfast.c
index 439a91b3e6..d7b5247681 100644
--- a/src/backend/access/gin/ginfast.c
+++ b/src/backend/access/gin/ginfast.c
@@ -246,7 +246,7 @@ ginHeapTupleFastInsert(GinState *ginstate, GinTupleCollector *collector)
 	 * tree, so it conflicts with all serializable scans.  All scans acquire a
 	 * predicate lock on the metabuffer to represent that.
 	 */
-	CheckForSerializableConflictIn(index, NULL, metabuffer);
+	CheckForSerializableConflictIn(index, NULL, GIN_METAPAGE_BLKNO);
 
 	if (collector->sumsize + collector->ntuples * sizeof(ItemIdData) > GinListPageSize)
 	{
diff --git a/src/backend/access/gin/gininsert.c b/src/backend/access/gin/gininsert.c
index d2905818b2..f66511bbcd 100644
--- a/src/backend/access/gin/gininsert.c
+++ b/src/backend/access/gin/gininsert.c
@@ -217,7 +217,8 @@ ginEntryInsert(GinState *ginstate,
 			return;
 		}
 
-		CheckForSerializableConflictIn(ginstate->index, NULL, stack->buffer);
+		CheckForSerializableConflictIn(ginstate->index, NULL,
+									   BufferGetBlockNumber(stack->buffer));
 		/* modify an existing leaf entry */
 		itup = addItemPointersToLeafTuple(ginstate, itup,
 										  items, nitem, buildStats, stack->buffer);
@@ -226,7 +227,8 @@ ginEntryInsert(GinState *ginstate,
 	}
 	else
 	{
-		CheckForSerializableConflictIn(ginstate->index, NULL, stack->buffer);
+		CheckForSerializableConflictIn(ginstate->index, NULL,
+									   BufferGetBlockNumber(stack->buffer));
 		/* no match, so construct a new leaf entry */
 		itup = buildFreshLeafTuple(ginstate, attnum, key, category,
 								   items, nitem, buildStats, stack->buffer);
diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 0cc87911d6..efb1a0cffa 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -1263,7 +1263,7 @@ gistinserttuples(GISTInsertState *state, GISTInsertStack *stack,
 	 * Check for any rw conflicts (in serializable isolation level) just
 	 * before we intend to modify the page
 	 */
-	CheckForSerializableConflictIn(state->r, NULL, stack->buffer);
+	CheckForSerializableConflictIn(state->r, NULL, BufferGetBlockNumber(stack->buffer));
 
 	/* Insert the tuple(s) to the page, splitting the page if necessary */
 	is_split = gistplacetopage(state->r, state->freespace, giststate,
diff --git a/src/backend/access/hash/hashinsert.c b/src/backend/access/hash/hashinsert.c
index 89876d2ccd..c87fa5970e 100644
--- a/src/backend/access/hash/hashinsert.c
+++ b/src/backend/access/hash/hashinsert.c
@@ -88,7 +88,7 @@ restart_insert:
 										  &usedmetap);
 	Assert(usedmetap != NULL);
 
-	CheckForSerializableConflictIn(rel, NULL, buf);
+	CheckForSerializableConflictIn(rel, NULL, BufferGetBlockNumber(buf));
 
 	/* remember the primary bucket buffer to release the pin on it at end. */
 	bucket_buf = buf;
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index 0128bb34ef..77bf3a31f1 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -41,6 +41,7 @@
 #include "access/multixact.h"
 #include "access/parallel.h"
 #include "access/relscan.h"
+#include "access/subtrans.h"
 #include "access/sysattr.h"
 #include "access/tableam.h"
 #include "access/transam.h"
@@ -446,8 +447,8 @@ heapgetpage(TableScanDesc sscan, BlockNumber page)
 			else
 				valid = HeapTupleSatisfiesVisibility(&loctup, snapshot, buffer);
 
-			CheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
-											&loctup, buffer, snapshot);
+			HeapCheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
+												&loctup, buffer, snapshot);
 
 			if (valid)
 				scan->rs_vistuples[ntup++] = lineoff;
@@ -668,9 +669,9 @@ heapgettup(HeapScanDesc scan,
 													 snapshot,
 													 scan->rs_cbuf);
 
-				CheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
-												tuple, scan->rs_cbuf,
-												snapshot);
+				HeapCheckForSerializableConflictOut(valid, scan->rs_base.rs_rd,
+													tuple, scan->rs_cbuf,
+													snapshot);
 
 				if (valid && key != NULL)
 					HeapKeyTest(tuple, RelationGetDescr(scan->rs_base.rs_rd),
@@ -1477,9 +1478,10 @@ heap_fetch(Relation relation,
 	valid = HeapTupleSatisfiesVisibility(tuple, snapshot, buffer);
 
 	if (valid)
-		PredicateLockTuple(relation, tuple, snapshot);
+		PredicateLockTID(relation, &(tuple->t_self), snapshot,
+						 HeapTupleHeaderGetXmin(tuple->t_data));
 
-	CheckForSerializableConflictOut(valid, relation, tuple, buffer, snapshot);
+	HeapCheckForSerializableConflictOut(valid, relation, tuple, buffer, snapshot);
 
 	LockBuffer(buffer, BUFFER_LOCK_UNLOCK);
 
@@ -1610,13 +1612,14 @@ heap_hot_search_buffer(ItemPointer tid, Relation relation, Buffer buffer,
 		{
 			/* If it's visible per the snapshot, we must return it */
 			valid = HeapTupleSatisfiesVisibility(heapTuple, snapshot, buffer);
-			CheckForSerializableConflictOut(valid, relation, heapTuple,
-											buffer, snapshot);
+			HeapCheckForSerializableConflictOut(valid, relation, heapTuple,
+												buffer, snapshot);
 
 			if (valid)
 			{
 				ItemPointerSetOffsetNumber(tid, offnum);
-				PredicateLockTuple(relation, heapTuple, snapshot);
+				PredicateLockTID(relation, &heapTuple->t_self, snapshot,
+								 HeapTupleHeaderGetXmin(heapTuple->t_data));
 				if (all_dead)
 					*all_dead = false;
 				return true;
@@ -1750,7 +1753,7 @@ heap_get_latest_tid(TableScanDesc sscan,
 		 * candidate.
 		 */
 		valid = HeapTupleSatisfiesVisibility(&tp, snapshot, buffer);
-		CheckForSerializableConflictOut(valid, relation, &tp, buffer, snapshot);
+		HeapCheckForSerializableConflictOut(valid, relation, &tp, buffer, snapshot);
 		if (valid)
 			*tid = ctid;
 
@@ -1905,7 +1908,7 @@ heap_insert(Relation relation, HeapTuple tup, CommandId cid,
 	 * lock "gaps" as index page locks do.  So we don't need to specify a
 	 * buffer when making the call, which makes for a faster check.
 	 */
-	CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);
+	CheckForSerializableConflictIn(relation, NULL, InvalidBlockNumber);
 
 	/* NO EREPORT(ERROR) from here till changes are logged */
 	START_CRIT_SECTION();
@@ -2159,7 +2162,7 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
 	 * lock "gaps" as index page locks do.  So we don't need to specify a
 	 * buffer when making the call, which makes for a faster check.
 	 */
-	CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);
+	CheckForSerializableConflictIn(relation, NULL, InvalidBlockNumber);
 
 	ndone = 0;
 	while (ndone < ntuples)
@@ -2350,7 +2353,7 @@ heap_multi_insert(Relation relation, TupleTableSlot **slots, int ntuples,
 	 * lock "gaps" as index page locks do.  So we don't need to specify a
 	 * buffer when making the call.
 	 */
-	CheckForSerializableConflictIn(relation, NULL, InvalidBuffer);
+	CheckForSerializableConflictIn(relation, NULL, InvalidBlockNumber);
 
 	/*
 	 * If tuples are cachable, mark them for invalidation from the caches in
@@ -2664,7 +2667,7 @@ l1:
 	 * being visible to the scan (i.e., an exclusive buffer content lock is
 	 * continuously held from this point until the tuple delete is visible).
 	 */
-	CheckForSerializableConflictIn(relation, &tp, buffer);
+	CheckForSerializableConflictIn(relation, tid, BufferGetBlockNumber(buffer));
 
 	/* replace cid with a combo cid if necessary */
 	HeapTupleHeaderAdjustCmax(tp.t_data, &cid, &iscombo);
@@ -3580,7 +3583,7 @@ l2:
 	 * will include checking the relation level, there is no benefit to a
 	 * separate check for the new tuple.
 	 */
-	CheckForSerializableConflictIn(relation, &oldtup, buffer);
+	CheckForSerializableConflictIn(relation, otid, BufferGetBlockNumber(buffer));
 
 	/*
 	 * At this point newbuf and buffer are both pinned and locked, and newbuf
@@ -9043,3 +9046,93 @@ heap_mask(char *pagedata, BlockNumber blkno)
 		}
 	}
 }
+
+/*
+ * HeapCheckForSerializableConflictOut
+ *		We are reading a tuple which has been modified.  If it is visible to
+ *		us but has been deleted, that indicates a rw-conflict out.  If it's
+ *		not visible and was created by a concurrent (overlapping)
+ *		serializable transaction, that is also a rw-conflict out,
+ *
+ * We will determine the top level xid of the writing transaction with which
+ * we may be in conflict, and check for overlap with our own transaction.
+ * If the transactions overlap (i.e., they cannot see each other's writes),
+ * then we have a conflict out.
+ *
+ * This function should be called just about anywhere in heapam.c where a
+ * tuple has been read. The caller must hold at least a shared lock on the
+ * buffer, because this function might set hint bits on the tuple. There is
+ * currently no known reason to call this function from an index AM.
+ */
+void
+HeapCheckForSerializableConflictOut(bool visible, Relation relation,
+									HeapTuple tuple, Buffer buffer,
+									Snapshot snapshot)
+{
+	TransactionId xid;
+	HTSV_Result htsvResult;
+
+	if (!CheckForSerializableConflictOutNeeded(relation, snapshot))
+		return;
+
+	/*
+	 * Check to see whether the tuple has been written to by a concurrent
+	 * transaction, either to create it not visible to us, or to delete it
+	 * while it is visible to us.  The "visible" bool indicates whether the
+	 * tuple is visible to us, while HeapTupleSatisfiesVacuum checks what else
+	 * is going on with it.
+	 */
+	htsvResult = HeapTupleSatisfiesVacuum(tuple, TransactionXmin, buffer);
+	switch (htsvResult)
+	{
+		case HEAPTUPLE_LIVE:
+			if (visible)
+				return;
+			xid = HeapTupleHeaderGetXmin(tuple->t_data);
+			break;
+		case HEAPTUPLE_RECENTLY_DEAD:
+			if (!visible)
+				return;
+			xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
+			break;
+		case HEAPTUPLE_DELETE_IN_PROGRESS:
+			xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
+			break;
+		case HEAPTUPLE_INSERT_IN_PROGRESS:
+			xid = HeapTupleHeaderGetXmin(tuple->t_data);
+			break;
+		case HEAPTUPLE_DEAD:
+			return;
+		default:
+
+			/*
+			 * The only way to get to this default clause is if a new value is
+			 * added to the enum type without adding it to this switch
+			 * statement.  That's a bug, so elog.
+			 */
+			elog(ERROR, "unrecognized return value from HeapTupleSatisfiesVacuum: %u", htsvResult);
+
+			/*
+			 * In spite of having all enum values covered and calling elog on
+			 * this default, some compilers think this is a code path which
+			 * allows xid to be used below without initialization. Silence
+			 * that warning.
+			 */
+			xid = InvalidTransactionId;
+	}
+
+	Assert(TransactionIdIsValid(xid));
+	Assert(TransactionIdFollowsOrEquals(xid, TransactionXmin));
+
+	/*
+	 * Find top level xid.  Bail out if xid is too early to be a conflict, or
+	 * if it's our own xid.
+	 */
+	if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
+		return;
+	xid = SubTransGetTopmostTransaction(xid);
+	if (TransactionIdPrecedes(xid, TransactionXmin))
+		return;
+
+	return CheckForSerializableConflictOut(relation, xid, snapshot);
+}
diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index 7081172dcc..9d3f366998 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -2165,10 +2165,11 @@ heapam_scan_bitmap_next_block(TableScanDesc scan,
 			if (valid)
 			{
 				hscan->rs_vistuples[ntup++] = offnum;
-				PredicateLockTuple(scan->rs_rd, &loctup, snapshot);
+				PredicateLockTID(scan->rs_rd, &loctup.t_self, snapshot,
+								 HeapTupleHeaderGetXmin(loctup.t_data));
 			}
-			CheckForSerializableConflictOut(valid, scan->rs_rd, &loctup,
-											buffer, snapshot);
+			HeapCheckForSerializableConflictOut(valid, scan->rs_rd, &loctup,
+												buffer, snapshot);
 		}
 	}
 
@@ -2355,8 +2356,8 @@ heapam_scan_sample_next_tuple(TableScanDesc scan, SampleScanState *scanstate,
 
 			/* in pagemode, heapgetpage did this for us */
 			if (!pagemode)
-				CheckForSerializableConflictOut(visible, scan->rs_rd, tuple,
-												hscan->rs_cbuf, scan->rs_snapshot);
+				HeapCheckForSerializableConflictOut(visible, scan->rs_rd, tuple,
+													hscan->rs_cbuf, scan->rs_snapshot);
 
 			/* Try next tuple from same page. */
 			if (!visible)
diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index 9dfa0ddfbb..d67afe141d 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -180,8 +180,8 @@ index_insert(Relation indexRelation,
 
 	if (!(indexRelation->rd_indam->ampredlocks))
 		CheckForSerializableConflictIn(indexRelation,
-									   (HeapTuple) NULL,
-									   InvalidBuffer);
+									   (ItemPointer) NULL,
+									   InvalidBlockNumber);
 
 	return indexRelation->rd_indam->aminsert(indexRelation, values, isnull,
 											 heap_t_ctid, heapRelation,
diff --git a/src/backend/access/nbtree/nbtinsert.c b/src/backend/access/nbtree/nbtinsert.c
index b84bf1c3df..f8afc2beea 100644
--- a/src/backend/access/nbtree/nbtinsert.c
+++ b/src/backend/access/nbtree/nbtinsert.c
@@ -290,7 +290,7 @@ top:
 		 * checkingunique and !heapkeyspace cases, but it's okay to use the
 		 * first page the value could be on (with scantid omitted) instead.
 		 */
-		CheckForSerializableConflictIn(rel, NULL, insertstate.buf);
+		CheckForSerializableConflictIn(rel, NULL, BufferGetBlockNumber(insertstate.buf));
 
 		/*
 		 * Do the insertion.  Note that insertstate contains cached binary
@@ -533,7 +533,7 @@ _bt_check_unique(Relation rel, BTInsertState insertstate, Relation heapRel,
 					 * otherwise be masked by this unique constraint
 					 * violation.
 					 */
-					CheckForSerializableConflictIn(rel, NULL, insertstate->buf);
+					CheckForSerializableConflictIn(rel, NULL, BufferGetBlockNumber(insertstate->buf));
 
 					/*
 					 * This is a definite conflict.  Break the tuple down into
diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 86b46d68b9..812aa496f4 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -163,8 +163,8 @@
  *		PredicateLockRelation(Relation relation, Snapshot snapshot)
  *		PredicateLockPage(Relation relation, BlockNumber blkno,
  *						Snapshot snapshot)
- *		PredicateLockTuple(Relation relation, HeapTuple tuple,
- *						Snapshot snapshot)
+ *		PredicateLockTID(Relation relation, ItemPointer tid, Snapshot snapshot,
+ *						 TransactionId insert_xid)
  *		PredicateLockPageSplit(Relation relation, BlockNumber oldblkno,
  *							   BlockNumber newblkno)
  *		PredicateLockPageCombine(Relation relation, BlockNumber oldblkno,
@@ -173,11 +173,10 @@
  *		ReleasePredicateLocks(bool isCommit, bool isReadOnlySafe)
  *
  * conflict detection (may also trigger rollback)
- *		CheckForSerializableConflictOut(bool visible, Relation relation,
- *										HeapTupleData *tup, Buffer buffer,
+ *		CheckForSerializableConflictOut(Relation relation, TransactionId xid,
  *										Snapshot snapshot)
- *		CheckForSerializableConflictIn(Relation relation, HeapTupleData *tup,
- *									   Buffer buffer)
+ *		CheckForSerializableConflictIn(Relation relation, ItemPointer tid,
+ *									   BlockNumber blkno)
  *		CheckTableForSerializableConflictIn(Relation relation)
  *
  * final rollback checking
@@ -193,8 +192,6 @@
 
 #include "postgres.h"
 
-#include "access/heapam.h"
-#include "access/htup_details.h"
 #include "access/parallel.h"
 #include "access/slru.h"
 #include "access/subtrans.h"
@@ -2538,28 +2535,28 @@ PredicateLockPage(Relation relation, BlockNumber blkno, Snapshot snapshot)
 }
 
 /*
- *		PredicateLockTuple
+ *		PredicateLockTID
  *
  * Gets a predicate lock at the tuple level.
  * Skip if not in full serializable transaction isolation level.
  * Skip if this is a temporary table.
  */
 void
-PredicateLockTuple(Relation relation, HeapTuple tuple, Snapshot snapshot)
+PredicateLockTID(Relation relation, ItemPointer tid, Snapshot snapshot,
+				 TransactionId tuple_xid)
 {
 	PREDICATELOCKTARGETTAG tag;
-	ItemPointer tid;
 
 	if (!SerializationNeededForRead(relation, snapshot))
 		return;
 
 	/*
-	 * If it's a heap tuple, return if this xact wrote it.
+	 * Return if this xact wrote it.
 	 */
 	if (relation->rd_index == NULL)
 	{
 		/* If we wrote it; we already have a write lock. */
-		if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tuple->t_data)))
+		if (TransactionIdIsCurrentTransactionId(tuple_xid))
 			return;
 	}
 
@@ -2575,7 +2572,6 @@ PredicateLockTuple(Relation relation, HeapTuple tuple, Snapshot snapshot)
 	if (PredicateLockExists(&tag))
 		return;
 
-	tid = &(tuple->t_self);
 	SET_PREDICATELOCKTARGETTAG_TUPLE(tag,
 									 relation->rd_node.dbNode,
 									 relation->rd_id,
@@ -4020,33 +4016,43 @@ XidIsConcurrent(TransactionId xid)
 	return false;
 }
 
+bool
+CheckForSerializableConflictOutNeeded(Relation relation, Snapshot snapshot)
+{
+	if (!SerializationNeededForRead(relation, snapshot))
+		return false;
+
+	/* Check if someone else has already decided that we need to die */
+	if (SxactIsDoomed(MySerializableXact))
+	{
+		ereport(ERROR,
+				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
+				 errmsg("could not serialize access due to read/write dependencies among transactions"),
+				 errdetail_internal("Reason code: Canceled on identification as a pivot, during conflict out checking."),
+				 errhint("The transaction might succeed if retried.")));
+	}
+
+	return true;
+}
+
 /*
  * CheckForSerializableConflictOut
  *		We are reading a tuple which has been modified.  If it is visible to
  *		us but has been deleted, that indicates a rw-conflict out.  If it's
  *		not visible and was created by a concurrent (overlapping)
- *		serializable transaction, that is also a rw-conflict out,
+ *		serializable transaction, that is also a rw-conflict out.
  *
  * We will determine the top level xid of the writing transaction with which
  * we may be in conflict, and check for overlap with our own transaction.
  * If the transactions overlap (i.e., they cannot see each other's writes),
  * then we have a conflict out.
- *
- * This function should be called just about anywhere in heapam.c where a
- * tuple has been read. The caller must hold at least a shared lock on the
- * buffer, because this function might set hint bits on the tuple. There is
- * currently no known reason to call this function from an index AM.
  */
 void
-CheckForSerializableConflictOut(bool visible, Relation relation,
-								HeapTuple tuple, Buffer buffer,
-								Snapshot snapshot)
+CheckForSerializableConflictOut(Relation relation, TransactionId xid, Snapshot snapshot)
 {
-	TransactionId xid;
 	SERIALIZABLEXIDTAG sxidtag;
 	SERIALIZABLEXID *sxid;
 	SERIALIZABLEXACT *sxact;
-	HTSV_Result htsvResult;
 
 	if (!SerializationNeededForRead(relation, snapshot))
 		return;
@@ -4060,64 +4066,8 @@ CheckForSerializableConflictOut(bool visible, Relation relation,
 				 errdetail_internal("Reason code: Canceled on identification as a pivot, during conflict out checking."),
 				 errhint("The transaction might succeed if retried.")));
 	}
-
-	/*
-	 * Check to see whether the tuple has been written to by a concurrent
-	 * transaction, either to create it not visible to us, or to delete it
-	 * while it is visible to us.  The "visible" bool indicates whether the
-	 * tuple is visible to us, while HeapTupleSatisfiesVacuum checks what else
-	 * is going on with it.
-	 */
-	htsvResult = HeapTupleSatisfiesVacuum(tuple, TransactionXmin, buffer);
-	switch (htsvResult)
-	{
-		case HEAPTUPLE_LIVE:
-			if (visible)
-				return;
-			xid = HeapTupleHeaderGetXmin(tuple->t_data);
-			break;
-		case HEAPTUPLE_RECENTLY_DEAD:
-			if (!visible)
-				return;
-			xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
-			break;
-		case HEAPTUPLE_DELETE_IN_PROGRESS:
-			xid = HeapTupleHeaderGetUpdateXid(tuple->t_data);
-			break;
-		case HEAPTUPLE_INSERT_IN_PROGRESS:
-			xid = HeapTupleHeaderGetXmin(tuple->t_data);
-			break;
-		case HEAPTUPLE_DEAD:
-			return;
-		default:
-
-			/*
-			 * The only way to get to this default clause is if a new value is
-			 * added to the enum type without adding it to this switch
-			 * statement.  That's a bug, so elog.
-			 */
-			elog(ERROR, "unrecognized return value from HeapTupleSatisfiesVacuum: %u", htsvResult);
-
-			/*
-			 * In spite of having all enum values covered and calling elog on
-			 * this default, some compilers think this is a code path which
-			 * allows xid to be used below without initialization. Silence
-			 * that warning.
-			 */
-			xid = InvalidTransactionId;
-	}
 	Assert(TransactionIdIsValid(xid));
-	Assert(TransactionIdFollowsOrEquals(xid, TransactionXmin));
 
-	/*
-	 * Find top level xid.  Bail out if xid is too early to be a conflict, or
-	 * if it's our own xid.
-	 */
-	if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
-		return;
-	xid = SubTransGetTopmostTransaction(xid);
-	if (TransactionIdPrecedes(xid, TransactionXmin))
-		return;
 	if (TransactionIdEquals(xid, GetTopTransactionIdIfAny()))
 		return;
 
@@ -4423,8 +4373,7 @@ CheckTargetForConflictsIn(PREDICATELOCKTARGETTAG *targettag)
  * tuple itself.
  */
 void
-CheckForSerializableConflictIn(Relation relation, HeapTuple tuple,
-							   Buffer buffer)
+CheckForSerializableConflictIn(Relation relation, ItemPointer tid, BlockNumber blkno)
 {
 	PREDICATELOCKTARGETTAG targettag;
 
@@ -4454,22 +4403,22 @@ CheckForSerializableConflictIn(Relation relation, HeapTuple tuple,
 	 * It is not possible to take and hold a lock across the checks for all
 	 * granularities because each target could be in a separate partition.
 	 */
-	if (tuple != NULL)
+	if (tid != NULL)
 	{
 		SET_PREDICATELOCKTARGETTAG_TUPLE(targettag,
 										 relation->rd_node.dbNode,
 										 relation->rd_id,
-										 ItemPointerGetBlockNumber(&(tuple->t_self)),
-										 ItemPointerGetOffsetNumber(&(tuple->t_self)));
+										 ItemPointerGetBlockNumber(tid),
+										 ItemPointerGetOffsetNumber(tid));
 		CheckTargetForConflictsIn(&targettag);
 	}
 
-	if (BufferIsValid(buffer))
+	if (blkno != InvalidBlockNumber)
 	{
 		SET_PREDICATELOCKTARGETTAG_PAGE(targettag,
 										relation->rd_node.dbNode,
 										relation->rd_id,
-										BufferGetBlockNumber(buffer));
+										blkno);
 		CheckTargetForConflictsIn(&targettag);
 	}
 
diff --git a/src/include/access/heapam.h b/src/include/access/heapam.h
index 858bcb6bc9..390023c0a5 100644
--- a/src/include/access/heapam.h
+++ b/src/include/access/heapam.h
@@ -217,5 +217,7 @@ extern bool ResolveCminCmaxDuringDecoding(struct HTAB *tuplecid_data,
 										  HeapTuple htup,
 										  Buffer buffer,
 										  CommandId *cmin, CommandId *cmax);
+extern void HeapCheckForSerializableConflictOut(bool valid, Relation relation, HeapTuple tuple,
+												Buffer buffer, Snapshot snapshot);
 
 #endif							/* HEAPAM_H */
diff --git a/src/include/storage/predicate.h b/src/include/storage/predicate.h
index 376245ecd7..7e1f67b8ed 100644
--- a/src/include/storage/predicate.h
+++ b/src/include/storage/predicate.h
@@ -57,16 +57,17 @@ extern void SetSerializableTransactionSnapshot(Snapshot snapshot,
 extern void RegisterPredicateLockingXid(TransactionId xid);
 extern void PredicateLockRelation(Relation relation, Snapshot snapshot);
 extern void PredicateLockPage(Relation relation, BlockNumber blkno, Snapshot snapshot);
-extern void PredicateLockTuple(Relation relation, HeapTuple tuple, Snapshot snapshot);
+extern void PredicateLockTID(Relation relation, ItemPointer tid, Snapshot snapshot,
+							 TransactionId insert_xid);
 extern void PredicateLockPageSplit(Relation relation, BlockNumber oldblkno, BlockNumber newblkno);
 extern void PredicateLockPageCombine(Relation relation, BlockNumber oldblkno, BlockNumber newblkno);
 extern void TransferPredicateLocksToHeapRelation(Relation relation);
 extern void ReleasePredicateLocks(bool isCommit, bool isReadOnlySafe);
 
 /* conflict detection (may also trigger rollback) */
-extern void CheckForSerializableConflictOut(bool valid, Relation relation, HeapTuple tuple,
-											Buffer buffer, Snapshot snapshot);
-extern void CheckForSerializableConflictIn(Relation relation, HeapTuple tuple, Buffer buffer);
+extern bool CheckForSerializableConflictOutNeeded(Relation relation, Snapshot snapshot);
+extern void CheckForSerializableConflictOut(Relation relation, TransactionId xid, Snapshot snapshot);
+extern void CheckForSerializableConflictIn(Relation relation, ItemPointer tid, BlockNumber blkno);
 extern void CheckTableForSerializableConflictIn(Relation relation);
 
 /* final rollback checking */
-- 
2.23.0

v4-0002-fixup.patchapplication/octet-stream; name=v4-0002-fixup.patchDownload
From 8c4ec236f84ad364a0f71416f871bd4871fe137f Mon Sep 17 00:00:00 2001
From: Thomas Munro <tmunro@postgresql.org>
Date: Mon, 11 Nov 2019 17:02:16 +1300
Subject: [PATCH v4 2/2] fixup

---
 src/backend/storage/lmgr/predicate.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/src/backend/storage/lmgr/predicate.c b/src/backend/storage/lmgr/predicate.c
index 812aa496f4..0c13057c19 100644
--- a/src/backend/storage/lmgr/predicate.c
+++ b/src/backend/storage/lmgr/predicate.c
@@ -4037,15 +4037,13 @@ CheckForSerializableConflictOutNeeded(Relation relation, Snapshot snapshot)
 
 /*
  * CheckForSerializableConflictOut
- *		We are reading a tuple which has been modified.  If it is visible to
- *		us but has been deleted, that indicates a rw-conflict out.  If it's
- *		not visible and was created by a concurrent (overlapping)
- *		serializable transaction, that is also a rw-conflict out.
+ *		A table AM is reading a tuple that has been modified.  After determining
+ *		that it is visible to us, it should call this function with the top
+ *		level xid of the writing transaction.
  *
- * We will determine the top level xid of the writing transaction with which
- * we may be in conflict, and check for overlap with our own transaction.
- * If the transactions overlap (i.e., they cannot see each other's writes),
- * then we have a conflict out.
+ * This function will check for overlap with our own transaction.  If the
+ * transactions overlap (i.e., they cannot see each other's writes), then we
+ * have a conflict out.
  */
 void
 CheckForSerializableConflictOut(Relation relation, TransactionId xid, Snapshot snapshot)
@@ -4054,8 +4052,8 @@ CheckForSerializableConflictOut(Relation relation, TransactionId xid, Snapshot s
 	SERIALIZABLEXID *sxid;
 	SERIALIZABLEXACT *sxact;
 
-	if (!SerializationNeededForRead(relation, snapshot))
-		return;
+	/* Calling code should have checked this. */
+	Assert(SerializationNeededForRead(relation, snapshot));
 
 	/* Check if someone else has already decided that we need to die */
 	if (SxactIsDoomed(MySerializableXact))
-- 
2.23.0

#23Ashwin Agrawal
aagrawal@pivotal.io
In reply to: Thomas Munro (#22)
Re: Remove HeapTuple and Buffer dependency for predicate locking functions

On Sun, Nov 10, 2019 at 8:21 PM Thomas Munro <thomas.munro@gmail.com> wrote:

I pushed the first two,

Thank You!

but on another read-through of the main patch

I didn't like the comments for CheckForSerializableConflictOut() or
the fact that it checks SerializationNeededForRead() again, after I
thought a bit about what the contract for this API is now. Here's a
version with small fixup that I'd like to squash into the patch.
Please let me know what you think,

The thought or reasoning behind having SerializationNeededForRead()
inside CheckForSerializableConflictOut() is to keep that API clean and
complete by itself. Only if AM like heap needs to perform some AM
specific checking only for serialization needed case can do so but is
not forced. So, if AM for example Zedstore doesn't need to do any
specific checking, then it can directly call
CheckForSerializableConflictOut(). With the modified fixup patch, the
responsibility is unnecessarily forced to caller even if
CheckForSerializableConflictOut() can do it. I understand the intent
is to avoid duplicate check for heap.

or if you see how to improve it
further.

I had proposed as alternative way in initial email and also later,
didn't receive comment on that, so re-posting.

Alternative way to refactor. CheckForSerializableConflictOut() can
take callback function and (void *) callback argument for AM specific
check. So, the flow would be AM calling
CheckForSerializableConflictOut() as today and only if
SerializationNeededForRead() will invoke the callback to check with AM
if more work should be performed or not. Essentially
HeapCheckForSerializableConflictOut() will become callback function
instead. So, roughly would look like....

typedef bool (*AMCheckForSerializableConflictOutCallback) (void *arg);

void CheckForSerializableConflictOut(Relation relation, TransactionId xid,
Snapshot snapshot, AMCheckForSerializableConflictOutCallback callback, void
*callback_arg)
{
if (!SerializationNeededForRead(relation, snapshot))
return;
if (callback != NULL && !callback(callback_args))
return;
........
.....
}

With this AMs which don't have any specific checks to perform can pass
callback as NULL. So, function call is involved only if
SerializationNeededForRead() and only for AMs which need it.

Just due to void* callback argument aspect I didn't prefer that
solution and felt AM performing checks and calling
CheckForSerializableConflictOut() seems better. Please let me know
how you feel about this.

#24Thomas Munro
thomas.munro@gmail.com
In reply to: Ashwin Agrawal (#23)
Re: Remove HeapTuple and Buffer dependency for predicate locking functions

On Wed, Nov 13, 2019 at 7:27 PM Ashwin Agrawal <aagrawal@pivotal.io> wrote:

On Sun, Nov 10, 2019 at 8:21 PM Thomas Munro <thomas.munro@gmail.com> wrote:

but on another read-through of the main patch
I didn't like the comments for CheckForSerializableConflictOut() or
the fact that it checks SerializationNeededForRead() again, after I
thought a bit about what the contract for this API is now. Here's a
version with small fixup that I'd like to squash into the patch.
Please let me know what you think,

The thought or reasoning behind having SerializationNeededForRead()
inside CheckForSerializableConflictOut() is to keep that API clean and
complete by itself. Only if AM like heap needs to perform some AM
specific checking only for serialization needed case can do so but is
not forced. So, if AM for example Zedstore doesn't need to do any
specific checking, then it can directly call
CheckForSerializableConflictOut(). With the modified fixup patch, the
responsibility is unnecessarily forced to caller even if
CheckForSerializableConflictOut() can do it. I understand the intent
is to avoid duplicate check for heap.

OK, I kept only the small comment change from that little fixup patch,
and pushed this.

I had proposed as alternative way in initial email and also later,
didn't receive comment on that, so re-posting.

typedef bool (*AMCheckForSerializableConflictOutCallback) (void *arg);

...

Just due to void* callback argument aspect I didn't prefer that
solution and felt AM performing checks and calling
CheckForSerializableConflictOut() seems better. Please let me know
how you feel about this.

Yeah. We could always come back to this idea if it looks better once
we have more experience with new table AMs.

#25Ashwin Agrawal
aagrawal@pivotal.io
In reply to: Thomas Munro (#24)
Re: Remove HeapTuple and Buffer dependency for predicate locking functions

On Mon, Jan 27, 2020 at 4:47 PM Thomas Munro <thomas.munro@gmail.com> wrote:

OK, I kept only the small comment change from that little fixup patch,
and pushed this.

I had proposed as alternative way in initial email and also later,
didn't receive comment on that, so re-posting.

typedef bool (*AMCheckForSerializableConflictOutCallback) (void *arg);

...

Just due to void* callback argument aspect I didn't prefer that
solution and felt AM performing checks and calling
CheckForSerializableConflictOut() seems better. Please let me know
how you feel about this.

Yeah. We could always come back to this idea if it looks better once
we have more experience with new table AMs.

Sounds good. Thank You!