From 5570c1b4840999d6b6dd8bae699de6742b18fe73 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <tomas.vondra@postgresql.org>
Date: Sun, 23 Jan 2022 01:00:29 +0100
Subject: [PATCH 2/2] Fix ordering of XIDs in ProcArrayApplyRecoveryInfo

Commit 8431e296ea reworked ProcArrayApplyRecoveryInfo to sort XIDs
before adding them to KnownAssignedXids, but it used xidComparator for
that purpose, which simply compares the values as uint32 values. This
may easily confuse KnownAssignedXidsAdd() which enforces the logical
ordering by calling TransactionIdFollowsOrEquals() etc.

Hitting this issue is fairly easy - you just need two transactons, one
started before the 4B limit (e.g. XID 4294967290), the other sometime
after it (e.g. XID 1000). Logically (4294967290 <= 1000) but when
compared using xidComparator we try to add them in the opposite order.
Which makes KnownAssignedXidsAdd() fail with an error like this:

  ERROR: out-of-order XID insertion in KnownAssignedXids

This however only happens during replica initialization - so you have to
restart (or create) a replica while there are such transactions with
contradictory XID ordering. Long-running transactions naturally increase
the likelihood of hitting this issue. Once the replica gets into this
state, it can't be started (even if the old transactions are terminated
in some way).

Fixed by sorting the XIDs logically - this is fine because we're dealing
with normal XIDs (because it's XIDs assigned to backends) and from the
same wraparound epoch (otherwise the backends could not be running at
the same time on the primary node). So there are no problems with the
triangle inequality, which is why xidComparator compares raw values.

Patch by me. Investigation and root cause analysis by Abhijit Menon-Sen.

This issue is present in all releases since 9.4, however releases up to
9.6 are EOL already so backpatch to 10 only.

Reviewed-by: Abhijit Menon-Sen
Reviewed-by: Alvaro Herrera
Backpatch-through: 10
---
 src/backend/storage/ipc/procarray.c |  7 ++++++-
 src/backend/utils/adt/xid.c         | 26 ++++++++++++++++++++++++++
 src/include/utils/builtins.h        |  1 +
 3 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 3be60402890..9d3efb7d80e 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -1164,8 +1164,13 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
 		/*
 		 * Sort the array so that we can add them safely into
 		 * KnownAssignedXids.
+		 *
+		 * We have to sort them logically, because in KnownAssignedXidsAdd we
+		 * call TransactionIdFollowsOrEquals and so on. But we know these XIDs
+		 * come from RUNNING_XACTS, which means there are only normal XIDs from
+		 * the same epoch, so this is safe.
 		 */
-		qsort(xids, nxids, sizeof(TransactionId), xidComparator);
+		qsort(xids, nxids, sizeof(TransactionId), xidLogicalComparator);
 
 		/*
 		 * Add the sorted snapshot into KnownAssignedXids.  The running-xacts
diff --git a/src/backend/utils/adt/xid.c b/src/backend/utils/adt/xid.c
index e6633e9cffe..9b4ceaea47f 100644
--- a/src/backend/utils/adt/xid.c
+++ b/src/backend/utils/adt/xid.c
@@ -145,6 +145,32 @@ xidComparator(const void *arg1, const void *arg2)
 	return 0;
 }
 
+/*
+ * xidLogicalComparator
+ *		qsort comparison function for XIDs
+ *
+ * This is used to compare only XIDs from the same epoch (e.g. for backends
+ * running at the same time). So there must be only normal XIDs, so there's
+ * no issue with triangle inequality.
+ */
+int
+xidLogicalComparator(const void *arg1, const void *arg2)
+{
+	TransactionId xid1 = *(const TransactionId *) arg1;
+	TransactionId xid2 = *(const TransactionId *) arg2;
+
+	Assert(TransactionIdIsNormal(xid1));
+	Assert(TransactionIdIsNormal(xid2));
+
+	if (TransactionIdPrecedes(xid1, xid2))
+		return -1;
+
+	if (TransactionIdPrecedes(xid2, xid1))
+		return 1;
+
+	return 0;
+}
+
 Datum
 xid8toxid(PG_FUNCTION_ARGS)
 {
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index 7ac4780e3fc..d8f05a7df3a 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -87,6 +87,7 @@ extern void text_to_cstring_buffer(const text *src, char *dst, size_t dst_len);
 
 /* xid.c */
 extern int	xidComparator(const void *arg1, const void *arg2);
+extern int	xidLogicalComparator(const void *arg1, const void *arg2);
 
 /* inet_cidr_ntop.c */
 extern char *pg_inet_cidr_ntop(int af, const void *src, int bits,
-- 
2.31.1

