Missed opportunity for bsearch() in TransactionIdIsCurrentTransactionId()?

Started by Antonin Houskaover 1 year ago4 messages
#1Antonin Houska
ah@cybertec.at
1 attachment(s)

I don't quite understand why TransactionIdIsCurrentTransactionId() implements
binary search in ParallelCurrentXids "from scratch" instead of using
bsearch().

If I read the code correctly, the contents of the ParallelCurrentXids is
composed in SerializeTransactionState(), which uses xidComparator:

qsort(workspace, nxids, sizeof(TransactionId), xidComparator);

so it should be o.k. to use bsearch(..., xidComparator).

For example, ReorderBufferCopySnap() also uses xidComparator to sort the
'subxip' array, and HeapTupleSatisfiesHistoricMVCC() then uses
TransactionIdInArray() (which is effectively bsearch(..., xidComparator)) to
search for particular XID in the array.

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

Attachments:

use_bsearch_for_ParallelCurrentXids.difftext/x-diffDownload
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index d119ab909d..8540e70e70 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -965,29 +965,9 @@ TransactionIdIsCurrentTransactionId(TransactionId xid)
 	 * the XIDs in this array are sorted numerically rather than according to
 	 * transactionIdPrecedes order.
 	 */
-	if (nParallelCurrentXids > 0)
-	{
-		int			low,
-					high;
-
-		low = 0;
-		high = nParallelCurrentXids - 1;
-		while (low <= high)
-		{
-			int			middle;
-			TransactionId probe;
-
-			middle = low + (high - low) / 2;
-			probe = ParallelCurrentXids[middle];
-			if (probe == xid)
-				return true;
-			else if (probe < xid)
-				low = middle + 1;
-			else
-				high = middle - 1;
-		}
-		return false;
-	}
+	if (bsearch(&xid, ParallelCurrentXids, nParallelCurrentXids,
+				sizeof(TransactionId), xidComparator))
+		return true;
 
 	/*
 	 * We will return true for the Xid of the current subtransaction, any of
#2Nathan Bossart
nathandbossart@gmail.com
In reply to: Antonin Houska (#1)
Re: Missed opportunity for bsearch() in TransactionIdIsCurrentTransactionId()?

On Wed, Jul 10, 2024 at 05:00:13PM +0200, Antonin Houska wrote:

I don't quite understand why TransactionIdIsCurrentTransactionId() implements
binary search in ParallelCurrentXids "from scratch" instead of using
bsearch().

I believe there are a number of open-coded binary searches in the tree. My
concern with switching them to bsearch() would be the performance impact of
using a function pointer for the comparisons. Perhaps we could add a
couple of inlined binary search implementations for common types to replace
many of the open-coded ones.

--
nathan

#3Antonin Houska
ah@cybertec.at
In reply to: Nathan Bossart (#2)
1 attachment(s)
Re: Missed opportunity for bsearch() in TransactionIdIsCurrentTransactionId()?

Nathan Bossart <nathandbossart@gmail.com> wrote:

On Wed, Jul 10, 2024 at 05:00:13PM +0200, Antonin Houska wrote:

I don't quite understand why TransactionIdIsCurrentTransactionId() implements
binary search in ParallelCurrentXids "from scratch" instead of using
bsearch().

I believe there are a number of open-coded binary searches in the tree.

Not sure if there are many, but I could find some:

* TransactionIdIsCurrentTransactionId()

* RelationHasSysCache()

* pg_dump.c:getRoleName()

My concern with switching them to bsearch() would be the performance impact
of using a function pointer for the comparisons. Perhaps we could add a
couple of inlined binary search implementations for common types to replace
many of the open-coded ones.

bsearch() appears to be used widely, but o.k., I don't insist on using it to
replace the existing open-coded searches.

What actually bothers me more than the absence of bsearch() is that
TransactionIdIsCurrentTransactionId() implements the binary search from
scratch. Even w/o bsearch(), it can still call TransactionIdInArray(). I ran
into the problem when working on [1]/messages/by-id/82651.1720540558@antos, which adds one more XID array.

Does the attached patch seem worth being applied separately, or at all?

[1]: /messages/by-id/82651.1720540558@antos

--
Antonin Houska
Web: https://www.cybertec-postgresql.com

Attachments:

0001-Refactor-search-in-a-sorted-XID-array.patchtext/x-diffDownload
From 470a66d0086bd8b656f88c86aec9f5861a763df7 Mon Sep 17 00:00:00 2001
From: Antonin Houska <ah@cybertec.at>
Date: Fri, 12 Jul 2024 10:06:32 +0200
Subject: [PATCH] Refactor search in a sorted XID array.

First, avoid duplicate definition of TransactionIdInArray().

Second, prefer open-coded implementation to bsearch(): it's probably cheaper
to compare two XIDs directly than via a bsearch() callback.
---
 src/backend/access/heap/heapam_visibility.c   | 10 ------
 src/backend/access/transam/xact.c             | 29 ++---------------
 .../replication/logical/reorderbuffer.c       | 11 -------
 src/include/access/transam.h                  | 32 +++++++++++++++++++
 4 files changed, 35 insertions(+), 47 deletions(-)

diff --git a/src/backend/access/heap/heapam_visibility.c b/src/backend/access/heap/heapam_visibility.c
index 9243feed01..d4653a28db 100644
--- a/src/backend/access/heap/heapam_visibility.c
+++ b/src/backend/access/heap/heapam_visibility.c
@@ -1559,16 +1559,6 @@ HeapTupleHeaderIsOnlyLocked(HeapTupleHeader tuple)
 	return true;
 }
 
-/*
- * check whether the transaction id 'xid' is in the pre-sorted array 'xip'.
- */
-static bool
-TransactionIdInArray(TransactionId xid, TransactionId *xip, Size num)
-{
-	return num > 0 &&
-		bsearch(&xid, xip, num, sizeof(TransactionId), xidComparator) != NULL;
-}
-
 /*
  * See the comments for HeapTupleSatisfiesMVCC for the semantics this function
  * obeys.
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index d119ab909d..795c04a879 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -961,33 +961,10 @@ TransactionIdIsCurrentTransactionId(TransactionId xid)
 
 	/*
 	 * In parallel workers, the XIDs we must consider as current are stored in
-	 * ParallelCurrentXids rather than the transaction-state stack.  Note that
-	 * the XIDs in this array are sorted numerically rather than according to
-	 * transactionIdPrecedes order.
+	 * ParallelCurrentXids rather than the transaction-state stack.
 	 */
-	if (nParallelCurrentXids > 0)
-	{
-		int			low,
-					high;
-
-		low = 0;
-		high = nParallelCurrentXids - 1;
-		while (low <= high)
-		{
-			int			middle;
-			TransactionId probe;
-
-			middle = low + (high - low) / 2;
-			probe = ParallelCurrentXids[middle];
-			if (probe == xid)
-				return true;
-			else if (probe < xid)
-				low = middle + 1;
-			else
-				high = middle - 1;
-		}
-		return false;
-	}
+	if (TransactionIdInArray(xid, ParallelCurrentXids, nParallelCurrentXids))
+		return true;
 
 	/*
 	 * We will return true for the Xid of the current subtransaction, any of
diff --git a/src/backend/replication/logical/reorderbuffer.c b/src/backend/replication/logical/reorderbuffer.c
index 00a8327e77..bd3edd0b47 100644
--- a/src/backend/replication/logical/reorderbuffer.c
+++ b/src/backend/replication/logical/reorderbuffer.c
@@ -5136,17 +5136,6 @@ ApplyLogicalMappingFile(HTAB *tuplecid_data, Oid relid, const char *fname)
 				 errmsg("could not close file \"%s\": %m", path)));
 }
 
-
-/*
- * Check whether the TransactionId 'xid' is in the pre-sorted array 'xip'.
- */
-static bool
-TransactionIdInArray(TransactionId xid, TransactionId *xip, Size num)
-{
-	return bsearch(&xid, xip, num,
-				   sizeof(TransactionId), xidComparator) != NULL;
-}
-
 /*
  * list_sort() comparator for sorting RewriteMappingFiles in LSN order.
  */
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index 28a2d287fd..78739b2490 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -370,6 +370,38 @@ FullTransactionIdNewer(FullTransactionId a, FullTransactionId b)
 	return b;
 }
 
+/*
+ * Check whether the transaction id 'xid' is in the pre-sorted array 'xip'.
+ *
+ * Note that the XIDs in this array must be sorted numerically rather than
+ * according to TransactionIdPrecedes order.
+ */
+static inline bool
+TransactionIdInArray(TransactionId xid, TransactionId *xip, Size num)
+{
+	int			low, high;
+
+	if (num == 0)
+		return false;
+
+	low = 0;
+	high = num - 1;
+	while (low <= high)
+	{
+		int			middle;
+		TransactionId probe;
+
+		middle = low + (high - low) / 2;
+		probe = xip[middle];
+		if (probe == xid)
+			return true;
+		else if (probe < xid)
+			low = middle + 1;
+		else
+			high = middle - 1;
+	}
+	return false;
+}
 #endif							/* FRONTEND */
 
 #endif							/* TRANSAM_H */
-- 
2.45.2

#4Nathan Bossart
nathandbossart@gmail.com
In reply to: Antonin Houska (#3)
Re: Missed opportunity for bsearch() in TransactionIdIsCurrentTransactionId()?

On Fri, Jul 12, 2024 at 12:01:11PM +0200, Antonin Houska wrote:

Nathan Bossart <nathandbossart@gmail.com> wrote:

My concern with switching them to bsearch() would be the performance impact
of using a function pointer for the comparisons. Perhaps we could add a
couple of inlined binary search implementations for common types to replace
many of the open-coded ones.

bsearch() appears to be used widely, but o.k., I don't insist on using it to
replace the existing open-coded searches.

Sorry, I didn't mean to say that I was totally against switching to
bsearch(), but I do think we need to understand whether there is any
performance impact before doing so, especially in hot paths. It would be
nice if we could reduce the number of open-coded binary searches in some
fashion, and if we can maintain or improve performance by creating a
handful of static inline functions, that would be even better. If there's
no apparent performance difference, bsearch() is probably fine.

--
nathan