Next step towards 64bit XIDs: Switch to FullTransactionId for PGPROC->xid and XLogRecord->xl_xid

Started by Maxim Orlovabout 2 years ago7 messages

orlovmg@gmail.com

about 2 years ago

2 attachment(s)

Hi!

As were discussed in [0]/messages/by-id/CACG=ezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe=pyyjVWA@mail.gmail.com our overall goal is to make Postgres 64 bit XIDs.
It's obvious, that such a big patch set
couldn't possible to commit "at once". SLUR patch set [1]/messages/by-id/CAJ7c6TPDOYBYrnCAeyndkBktO0WG2xSdYduTF0nxq+vfkmTF5Q@mail.gmail.com was committed a
short while ago as a first significant
step in this direction.

This thread is a next step in this enterprise. My objective here is to
commit some changes, which were mandatory,
as far as I understand, for any type of 64 XIDs implementation. And I'm
sure there will be points for discussion here.

My original intention was to make PGPROC->xmin, PGPROC->xid and
PROC_HDR->xids 64bit. But in reality,
it turned out to be much more difficult than I expected. On the one hand,
the patch became too big and on the other
hand, it's heavily relayed on epoch and XID "adjustment" to FXID. Therefore,
for now, I decided to limit myself to
more atomic and independent changes. However, as I said above, these
changes are required for any implementation
of 64bit XIDs.

So, PFA patches to make switch PGPROC->xid and XLogRecord->xl_xid to
FullTransactionId.

As always, any opinions are very welcome!

[0]: /messages/by-id/CACG=ezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe=pyyjVWA@mail.gmail.com
/messages/by-id/CACG=ezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe=pyyjVWA@mail.gmail.com
[1]: /messages/by-id/CAJ7c6TPDOYBYrnCAeyndkBktO0WG2xSdYduTF0nxq+vfkmTF5Q@mail.gmail.com
/messages/by-id/CAJ7c6TPDOYBYrnCAeyndkBktO0WG2xSdYduTF0nxq+vfkmTF5Q@mail.gmail.com

--
Best regards,
Maxim Orlov.

Attachments:

v1-0002-Switch-to-FullTransactionId-for-XLogRecord-xl_xid.patchapplication/octet-stream; name=v1-0002-Switch-to-FullTransactionId-for-XLogRecord-xl_xid.patchDownload

From b8054982cee3e51db593c9e446cceb6788d7ab13 Mon Sep 17 00:00:00 2001
From: Maxim Orlov <orlovmg@gmail.com>
Date: Fri, 29 Dec 2023 14:42:13 +0300
Subject: [PATCH v1 2/2] Switch to FullTransactionId for XLogRecord->xl_xid

As a step towards 64bit xids switch to FullTransactionId for XLogRecord->xl_xid.

Author: Maxim Orlov <orlovmg@gmail.com>
---
 src/backend/access/transam/xlog.c         |  2 +-
 src/backend/access/transam/xloginsert.c   |  2 +-
 src/backend/access/transam/xlogreader.c   | 17 +---------------
 src/backend/access/transam/xlogrecovery.c |  6 +++---
 src/bin/pg_resetwal/pg_resetwal.c         |  2 +-
 src/bin/pg_waldump/pg_waldump.c           |  2 +-
 src/include/access/xlogreader.h           |  2 +-
 src/include/access/xlogrecord.h           |  9 +++++----
 src/test/recovery/t/039_end_of_wal.pl     | 24 +++++++++++++----------
 9 files changed, 28 insertions(+), 38 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 1264849883..c38a39beda 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -4906,7 +4906,7 @@ BootStrapXLOG(void)
 	recptr = ((char *) page + SizeOfXLogLongPHD);
 	record = (XLogRecord *) recptr;
 	record->xl_prev = 0;
-	record->xl_xid = InvalidTransactionId;
+	record->xl_xid = InvalidFullTransactionId;
 	record->xl_tot_len = SizeOfXLogRecord + SizeOfXLogRecordDataHeaderShort + sizeof(checkPoint);
 	record->xl_info = XLOG_CHECKPOINT_SHUTDOWN;
 	record->xl_rmid = RM_XLOG_ID;
diff --git a/src/backend/access/transam/xloginsert.c b/src/backend/access/transam/xloginsert.c
index e4aaa551a0..8c25e43fb7 100644
--- a/src/backend/access/transam/xloginsert.c
+++ b/src/backend/access/transam/xloginsert.c
@@ -924,7 +924,7 @@ XLogRecordAssemble(RmgrId rmid, uint8 info,
 	 * once we know where in the WAL the record will be inserted. The CRC does
 	 * not include the record header yet.
 	 */
-	rechdr->xl_xid = GetCurrentTransactionIdIfAny();
+	rechdr->xl_xid = GetCurrentFullTransactionIdIfAny();
 	rechdr->xl_tot_len = (uint32) total_len;
 	rechdr->xl_info = info;
 	rechdr->xl_rmid = rmid;
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index 6b404b8169..e1251c8638 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -2171,28 +2171,13 @@ RestoreBlockImage(XLogReaderState *record, uint8 block_id, char *page)
 FullTransactionId
 XLogRecGetFullXid(XLogReaderState *record)
 {
-	TransactionId xid,
-				next_xid;
-	uint32		epoch;
-
 	/*
 	 * This function is only safe during replay, because it depends on the
 	 * replay state.  See AdvanceNextFullTransactionIdPastXid() for more.
 	 */
 	Assert(AmStartupProcess() || !IsUnderPostmaster);
 
-	xid = XLogRecGetXid(record);
-	next_xid = XidFromFullTransactionId(TransamVariables->nextXid);
-	epoch = EpochFromFullTransactionId(TransamVariables->nextXid);
-
-	/*
-	 * If xid is numerically greater than next_xid, it has to be from the last
-	 * epoch.
-	 */
-	if (unlikely(xid > next_xid))
-		--epoch;
-
-	return FullTransactionIdFromEpochAndXid(epoch, xid);
+	return record->record->header.xl_xid;
 }
 
 #endif
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c
index 6f4f81f992..dd3d770c3a 100644
--- a/src/backend/access/transam/xlogrecovery.c
+++ b/src/backend/access/transam/xlogrecovery.c
@@ -1875,7 +1875,7 @@ ApplyWalRecord(XLogReaderState *xlogreader, XLogRecord *record, TimeLineID *repl
 	/*
 	 * TransamVariables->nextXid must be beyond record's xid.
 	 */
-	AdvanceNextFullTransactionIdPastXid(record->xl_xid);
+	AdvanceNextFullTransactionIdPastXid(XidFromFullTransactionId(record->xl_xid));
 
 	/*
 	 * Before replaying this record, check if this record causes the current
@@ -1933,8 +1933,8 @@ ApplyWalRecord(XLogReaderState *xlogreader, XLogRecord *record, TimeLineID *repl
 	 * If we are attempting to enter Hot Standby mode, process XIDs we see
 	 */
 	if (standbyState >= STANDBY_INITIALIZED &&
-		TransactionIdIsValid(record->xl_xid))
-		RecordKnownAssignedTransactionIds(record->xl_xid);
+		FullTransactionIdIsValid(record->xl_xid))
+		RecordKnownAssignedTransactionIds(XidFromFullTransactionId(record->xl_xid));
 
 	/*
 	 * Some XLOG record types that are related to recovery are processed
diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c
index 5407f51a4e..43eb22d415 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -1105,7 +1105,7 @@ WriteEmptyXLOG(void)
 	recptr = (char *) page + SizeOfXLogLongPHD;
 	record = (XLogRecord *) recptr;
 	record->xl_prev = 0;
-	record->xl_xid = InvalidTransactionId;
+	record->xl_xid = InvalidFullTransactionId;
 	record->xl_tot_len = SizeOfXLogRecord + SizeOfXLogRecordDataHeaderShort + sizeof(CheckPoint);
 	record->xl_info = XLOG_CHECKPOINT_SHUTDOWN;
 	record->xl_rmid = RM_XLOG_ID;
diff --git a/src/bin/pg_waldump/pg_waldump.c b/src/bin/pg_waldump/pg_waldump.c
index a3535bdfa9..7200166282 100644
--- a/src/bin/pg_waldump/pg_waldump.c
+++ b/src/bin/pg_waldump/pg_waldump.c
@@ -1261,7 +1261,7 @@ main(int argc, char **argv)
 			continue;
 
 		if (config.filter_by_xid_enabled &&
-			config.filter_by_xid != record->xl_xid)
+			config.filter_by_xid != XidFromFullTransactionId(record->xl_xid))
 			continue;
 
 		/* check for extended filtering */
diff --git a/src/include/access/xlogreader.h b/src/include/access/xlogreader.h
index 0813722715..0de1cffd4f 100644
--- a/src/include/access/xlogreader.h
+++ b/src/include/access/xlogreader.h
@@ -409,7 +409,7 @@ extern bool DecodeXLogRecord(XLogReaderState *state,
 #define XLogRecGetPrev(decoder) ((decoder)->record->header.xl_prev)
 #define XLogRecGetInfo(decoder) ((decoder)->record->header.xl_info)
 #define XLogRecGetRmid(decoder) ((decoder)->record->header.xl_rmid)
-#define XLogRecGetXid(decoder) ((decoder)->record->header.xl_xid)
+#define XLogRecGetXid(decoder) ((uint32) ((decoder)->record->header.xl_xid.value))
 #define XLogRecGetOrigin(decoder) ((decoder)->record->record_origin)
 #define XLogRecGetTopXid(decoder) ((decoder)->record->toplevel_xid)
 #define XLogRecGetData(decoder) ((decoder)->record->main_data)
diff --git a/src/include/access/xlogrecord.h b/src/include/access/xlogrecord.h
index ec9a3c802a..99157af92a 100644
--- a/src/include/access/xlogrecord.h
+++ b/src/include/access/xlogrecord.h
@@ -12,6 +12,7 @@
 #define XLOGRECORD_H
 
 #include "access/rmgr.h"
+#include "access/transam.h"
 #include "access/xlogdefs.h"
 #include "port/pg_crc32c.h"
 #include "storage/block.h"
@@ -41,18 +42,18 @@
 typedef struct XLogRecord
 {
 	uint32		xl_tot_len;		/* total len of entire record */
-	TransactionId xl_xid;		/* xact id */
+	pg_crc32c	xl_crc;			/* CRC for this record */
+	FullTransactionId xl_xid;	/* xact id */
 	XLogRecPtr	xl_prev;		/* ptr to previous record in log */
 	uint8		xl_info;		/* flag bits, see below */
 	RmgrId		xl_rmid;		/* resource manager for this record */
-	/* 2 bytes of padding here, initialize to zero */
-	pg_crc32c	xl_crc;			/* CRC for this record */
+	/* 6 bytes of padding here, initialize to zero */
 
 	/* XLogRecordBlockHeaders and XLogRecordDataHeader follow, no padding */
 
 } XLogRecord;
 
-#define SizeOfXLogRecord	(offsetof(XLogRecord, xl_crc) + sizeof(pg_crc32c))
+#define SizeOfXLogRecord	(offsetof(XLogRecord, xl_rmid) + sizeof(RmgrId))
 
 /*
  * The high 4 bits in xl_info may be used freely by rmgr. The
diff --git a/src/test/recovery/t/039_end_of_wal.pl b/src/test/recovery/t/039_end_of_wal.pl
index d2bf062bb2..d4aca99995 100644
--- a/src/test/recovery/t/039_end_of_wal.pl
+++ b/src/test/recovery/t/039_end_of_wal.pl
@@ -21,7 +21,7 @@ use integer;    # causes / operator to use integer math
 my $BIG_ENDIAN = pack("L", 0x12345678) eq pack("N", 0x12345678);
 
 # Header size of record header.
-my $RECORD_HEADER_SIZE = 24;
+my $RECORD_HEADER_SIZE = 26;
 
 # Fields retrieved from code headers.
 my @scan_result = scan_server_header('access/xlog_internal.h',
@@ -131,17 +131,21 @@ sub build_record_header
 
 	# This needs to follow the structure XLogRecord:
 	# I for xl_tot_len
-	# I for xl_xid
+	# I for xl_crc
+	# II for xl_xid
 	# II for xl_prev
 	# C for xl_info
 	# C for xl_rmid
-	# BB for two bytes of padding
-	# I for xl_crc
-	return pack("IIIICCBBI",
-		$xl_tot_len, $xl_xid,
+	# BBBBBB for two bytes of padding
+	return pack("IIIIIICCBBBBBB",
+		$xl_tot_len,
+		$xl_crc,
+		$BIG_ENDIAN ? 0        : $xl_xid,
+		$BIG_ENDIAN ? $xl_xid  : 0,
 		$BIG_ENDIAN ? 0        : $xl_prev,
 		$BIG_ENDIAN ? $xl_prev : 0,
-		$xl_info, $xl_rmid, 0, 0, $xl_crc);
+		$xl_info, $xl_rmid,
+		0, 0, 0, 0, 0, 0);
 }
 
 # Build a fake WAL page header, based on the data given by the caller
@@ -265,7 +269,7 @@ $node->stop('immediate');
 my $log_size = -s $node->logfile;
 $node->start;
 ok( $node->log_contains(
-		"invalid record length at .*: expected at least 24, got 0", $log_size
+		"invalid record length at .*: expected at least 26, got 0", $log_size
 	),
 	"xl_tot_len zero");
 
@@ -277,7 +281,7 @@ write_wal($node, $TLI, $end_lsn, build_record_header(23));
 $log_size = -s $node->logfile;
 $node->start;
 ok( $node->log_contains(
-		"invalid record length at .*: expected at least 24, got 23",
+		"invalid record length at .*: expected at least 26, got 23",
 		$log_size),
 	"xl_tot_len short");
 
@@ -290,7 +294,7 @@ write_wal($node, $TLI, $end_lsn, build_record_header(1));
 $log_size = -s $node->logfile;
 $node->start;
 ok( $node->log_contains(
-		"invalid record length at .*: expected at least 24, got 1", $log_size
+		"invalid record length at .*: expected at least 26, got 1", $log_size
 	),
 	"xl_tot_len short at end-of-page");
 
-- 
2.42.0

v1-0001-Switch-to-FullTransactionId-for-PGPROC-xid.patchapplication/octet-stream; name=v1-0001-Switch-to-FullTransactionId-for-PGPROC-xid.patchDownload

From d53bea02c9465c30c50604324ab8b59e64ccc220 Mon Sep 17 00:00:00 2001
From: Maxim Orlov <orlovmg@gmail.com>
Date: Thu, 28 Dec 2023 18:45:19 +0300
Subject: [PATCH v1 1/2] Switch to FullTransactionId for PGPROC->xid

As a step towards 64bit xids switch to FullTransactionId for PGPROC->xid.

Author: Maxim Orlov <orlovmg@gmail.com>
---
 src/backend/access/transam/clog.c     |  2 +-
 src/backend/access/transam/twophase.c |  6 ++++--
 src/backend/access/transam/varsup.c   |  4 ++--
 src/backend/commands/indexcmds.c      |  2 +-
 src/backend/storage/ipc/procarray.c   | 25 +++++++++++++------------
 src/backend/storage/ipc/sinvaladt.c   |  2 +-
 src/backend/storage/lmgr/lock.c       |  4 ++--
 src/backend/storage/lmgr/proc.c       |  4 ++--
 src/include/access/transam.h          | 15 ++++++++++++++-
 src/include/storage/proc.h            |  2 +-
 10 files changed, 41 insertions(+), 25 deletions(-)

diff --git a/src/backend/access/transam/clog.c b/src/backend/access/transam/clog.c
index 7dca1df61b..96af8166ec 100644
--- a/src/backend/access/transam/clog.c
+++ b/src/backend/access/transam/clog.c
@@ -302,7 +302,7 @@ TransactionIdSetPageStatus(TransactionId xid, int nsubxids,
 	 * sub-XIDs and all of the XIDs for which we're adjusting clog should be
 	 * on the same page.  Check those conditions, too.
 	 */
-	if (all_xact_same_page && xid == MyProc->xid &&
+	if (all_xact_same_page && xid == XidFromFullTransactionId(MyProc->xid) &&
 		nsubxids <= THRESHOLD_SUBTRANS_CLOG_OPT &&
 		nsubxids == MyProc->subxidStatus.count &&
 		(nsubxids == 0 ||
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 11e1446cd1..4e3564580f 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -234,6 +234,8 @@ static void MarkAsPreparingGuts(GlobalTransaction gxact, TransactionId xid,
 static void RemoveTwoPhaseFile(TransactionId xid, bool giveWarning);
 static void RecreateTwoPhaseFile(TransactionId xid, void *content, int len);
 
+static inline FullTransactionId AdjustToFullTransactionId(TransactionId xid);
+
 /*
  * Initialization of shared memory
  */
@@ -477,7 +479,7 @@ MarkAsPreparingGuts(GlobalTransaction gxact, TransactionId xid, const char *gid,
 		proc->lxid = xid;
 		proc->backendId = InvalidBackendId;
 	}
-	proc->xid = xid;
+	proc->xid = AdjustToFullTransactionId(xid);
 	Assert(proc->xmin == InvalidTransactionId);
 	proc->delayChkptFlags = 0;
 	proc->statusFlags = 0;
@@ -793,7 +795,7 @@ pg_prepared_xact(PG_FUNCTION_ARGS)
 		 * Form tuple with appropriate data.
 		 */
 
-		values[0] = TransactionIdGetDatum(proc->xid);
+		values[0] = TransactionIdGetDatum(XidFromFullTransactionId(proc->xid));
 		values[1] = CStringGetTextDatum(gxact->gid);
 		values[2] = TimestampTzGetDatum(gxact->prepared_at);
 		values[3] = ObjectIdGetDatum(gxact->owner);
diff --git a/src/backend/access/transam/varsup.c b/src/backend/access/transam/varsup.c
index 77b18b8f28..b85fae9826 100644
--- a/src/backend/access/transam/varsup.c
+++ b/src/backend/access/transam/varsup.c
@@ -93,7 +93,7 @@ GetNewTransactionId(bool isSubXact)
 	if (IsBootstrapProcessingMode())
 	{
 		Assert(!isSubXact);
-		MyProc->xid = BootstrapTransactionId;
+		MyProc->xid = BootstrapFullTransactionId;
 		ProcGlobal->xids[MyProc->pgxactoff] = BootstrapTransactionId;
 		return FullTransactionIdFromEpochAndXid(0, BootstrapTransactionId);
 	}
@@ -255,7 +255,7 @@ GetNewTransactionId(bool isSubXact)
 		Assert(!MyProc->subxidStatus.overflowed);
 
 		/* LWLockRelease acts as barrier */
-		MyProc->xid = xid;
+		MyProc->xid = full_xid;
 		ProcGlobal->xids[MyProc->pgxactoff] = xid;
 	}
 	else
diff --git a/src/backend/commands/indexcmds.c b/src/backend/commands/indexcmds.c
index e56205abd8..a070afb0a7 100644
--- a/src/backend/commands/indexcmds.c
+++ b/src/backend/commands/indexcmds.c
@@ -4433,7 +4433,7 @@ set_indexsafe_procflags(void)
 	 * This should only be called before installing xid or xmin in MyProc;
 	 * otherwise, concurrent processes could see an Xmin that moves backwards.
 	 */
-	Assert(MyProc->xid == InvalidTransactionId &&
+	Assert(FullTransactionIdIsInvalid(MyProc->xid) &&
 		   MyProc->xmin == InvalidTransactionId);
 
 	LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index 976f7856fb..ac9198ae52 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -525,7 +525,7 @@ ProcArrayAdd(PGPROC *proc)
 
 	arrayP->pgprocnos[index] = proc->pgprocno;
 	proc->pgxactoff = index;
-	ProcGlobal->xids[index] = proc->xid;
+	ProcGlobal->xids[index] = XidFromFullTransactionId(proc->xid);
 	ProcGlobal->subxidStates[index] = proc->subxidStatus;
 	ProcGlobal->statusFlags[index] = proc->statusFlags;
 
@@ -674,7 +674,7 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
 		 * else is taking a snapshot.  See discussion in
 		 * src/backend/access/transam/README.
 		 */
-		Assert(TransactionIdIsValid(proc->xid));
+		Assert(FullTransactionIdIsValid(proc->xid));
 
 		/*
 		 * If we can immediately acquire ProcArrayLock, we clear our own XID
@@ -696,7 +696,7 @@ ProcArrayEndTransaction(PGPROC *proc, TransactionId latestXid)
 		 * anyone else's calculation of a snapshot.  We might change their
 		 * estimate of global xmin, but that's OK.
 		 */
-		Assert(!TransactionIdIsValid(proc->xid));
+		Assert(!TransactionIdIsValid(XidFromFullTransactionId(proc->xid)));
 		Assert(proc->subxidStatus.count == 0);
 		Assert(!proc->subxidStatus.overflowed);
 
@@ -738,10 +738,10 @@ ProcArrayEndTransactionInternal(PGPROC *proc, TransactionId latestXid)
 	 */
 	Assert(LWLockHeldByMeInMode(ProcArrayLock, LW_EXCLUSIVE));
 	Assert(TransactionIdIsValid(ProcGlobal->xids[pgxactoff]));
-	Assert(ProcGlobal->xids[pgxactoff] == proc->xid);
+	Assert(ProcGlobal->xids[pgxactoff] == XidFromFullTransactionId(proc->xid));
 
 	ProcGlobal->xids[pgxactoff] = InvalidTransactionId;
-	proc->xid = InvalidTransactionId;
+	proc->xid = InvalidFullTransactionId;
 	proc->lxid = InvalidLocalTransactionId;
 	proc->xmin = InvalidTransactionId;
 
@@ -796,7 +796,7 @@ ProcArrayGroupClearXid(PGPROC *proc, TransactionId latestXid)
 	uint32		wakeidx;
 
 	/* We should definitely have an XID to clear. */
-	Assert(TransactionIdIsValid(proc->xid));
+	Assert(FullTransactionIdIsValid(proc->xid));
 
 	/* Add ourselves to the list of processes needing a group XID clear. */
 	proc->procArrayGroupMember = true;
@@ -926,7 +926,7 @@ ProcArrayClearTransaction(PGPROC *proc)
 	pgxactoff = proc->pgxactoff;
 
 	ProcGlobal->xids[pgxactoff] = InvalidTransactionId;
-	proc->xid = InvalidTransactionId;
+	proc->xid = InvalidFullTransactionId;
 
 	proc->lxid = InvalidLocalTransactionId;
 	proc->xmin = InvalidTransactionId;
@@ -1734,7 +1734,8 @@ ComputeXidHorizons(ComputeXidHorizonsResult *h)
 	 * additions.
 	 */
 	{
-		TransactionId initial;
+		TransactionId	initial,
+						myproc_xid = XidFromFullTransactionId(MyProc->xid);
 
 		initial = XidFromFullTransactionId(h->latest_completed);
 		Assert(TransactionIdIsValid(initial));
@@ -1756,8 +1757,8 @@ ComputeXidHorizons(ComputeXidHorizonsResult *h)
 		 * definition, can't be any newer changes in the temp table than
 		 * latestCompletedXid.
 		 */
-		if (TransactionIdIsValid(MyProc->xid))
-			h->temp_oldest_nonremovable = MyProc->xid;
+		if (TransactionIdIsValid(myproc_xid))
+			h->temp_oldest_nonremovable = myproc_xid;
 		else
 			h->temp_oldest_nonremovable = initial;
 	}
@@ -2222,7 +2223,7 @@ GetSnapshotData(Snapshot snapshot)
 	latest_completed = TransamVariables->latestCompletedXid;
 	mypgxactoff = MyProc->pgxactoff;
 	myxid = other_xids[mypgxactoff];
-	Assert(myxid == MyProc->xid);
+	Assert(myxid == XidFromFullTransactionId(MyProc->xid));
 
 	oldestxid = TransamVariables->oldestXid;
 	curXactCompletionCount = TransamVariables->xactCompletionCount;
@@ -3484,7 +3485,7 @@ MinimumActiveBackends(int min)
 			continue;			/* do not count deleted entries */
 		if (proc == MyProc)
 			continue;			/* do not count myself */
-		if (proc->xid == InvalidTransactionId)
+		if (FullTransactionIdIsInvalid(proc->xid))
 			continue;			/* do not count if no XID assigned */
 		if (proc->pid == 0)
 			continue;			/* do not count prepared xacts */
diff --git a/src/backend/storage/ipc/sinvaladt.c b/src/backend/storage/ipc/sinvaladt.c
index 3d97c75bf1..3ae06bfa33 100644
--- a/src/backend/storage/ipc/sinvaladt.c
+++ b/src/backend/storage/ipc/sinvaladt.c
@@ -429,7 +429,7 @@ BackendIdGetTransactionIds(int backendID, TransactionId *xid,
 
 		if (proc != NULL)
 		{
-			*xid = proc->xid;
+			*xid = XidFromFullTransactionId(proc->xid);
 			*xmin = proc->xmin;
 			*nsubxid = proc->subxidStatus.count;
 			*overflowed = proc->subxidStatus.overflowed;
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index b8c57b3e16..8d988a3c17 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -3981,7 +3981,7 @@ GetRunningTransactionLocks(int *nlocks)
 		{
 			PGPROC	   *proc = proclock->tag.myProc;
 			LOCK	   *lock = proclock->tag.myLock;
-			TransactionId xid = proc->xid;
+			TransactionId xid = XidFromFullTransactionId(proc->xid);
 
 			/*
 			 * Don't record locks for transactions if we know they have
@@ -4601,7 +4601,7 @@ VirtualXactLock(VirtualTransactionId vxid, bool wait)
 	 * so we won't save an XID of a different VXID.  It doesn't matter whether
 	 * we save this before or after setting up the primary lock table entry.
 	 */
-	xid = proc->xid;
+	xid = XidFromFullTransactionId(proc->xid);
 
 	/* Done with proc->fpLockBits */
 	LWLockRelease(&proc->fpInfoLock);
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index b6451d9d08..e6e06d277b 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -377,7 +377,7 @@ InitProcess(void)
 	MyProc->lxid = InvalidLocalTransactionId;
 	MyProc->fpVXIDLock = false;
 	MyProc->fpLocalTransactionId = InvalidLocalTransactionId;
-	MyProc->xid = InvalidTransactionId;
+	MyProc->xid = InvalidFullTransactionId;
 	MyProc->xmin = InvalidTransactionId;
 	MyProc->pid = MyProcPid;
 	/* backendId, databaseId and roleId will be filled in later */
@@ -574,7 +574,7 @@ InitAuxiliaryProcess(void)
 	MyProc->lxid = InvalidLocalTransactionId;
 	MyProc->fpVXIDLock = false;
 	MyProc->fpLocalTransactionId = InvalidLocalTransactionId;
-	MyProc->xid = InvalidTransactionId;
+	MyProc->xid = InvalidFullTransactionId;
 	MyProc->xmin = InvalidTransactionId;
 	MyProc->backendId = InvalidBackendId;
 	MyProc->databaseId = InvalidOid;
diff --git a/src/include/access/transam.h b/src/include/access/transam.h
index eef2b2cdbe..2ed3040959 100644
--- a/src/include/access/transam.h
+++ b/src/include/access/transam.h
@@ -52,8 +52,8 @@
 #define FullTransactionIdPrecedesOrEquals(a, b) ((a).value <= (b).value)
 #define FullTransactionIdFollows(a, b) ((a).value > (b).value)
 #define FullTransactionIdFollowsOrEquals(a, b) ((a).value >= (b).value)
-#define FullTransactionIdIsValid(x)		TransactionIdIsValid(XidFromFullTransactionId(x))
 #define InvalidFullTransactionId		FullTransactionIdFromEpochAndXid(0, InvalidTransactionId)
+#define BootstrapFullTransactionId		FullTransactionIdFromEpochAndXid(0, BootstrapTransactionId)
 #define FirstNormalFullTransactionId	FullTransactionIdFromEpochAndXid(0, FirstNormalTransactionId)
 #define FullTransactionIdIsNormal(x)	FullTransactionIdFollowsOrEquals(x, FirstNormalFullTransactionId)
 
@@ -67,6 +67,19 @@ typedef struct FullTransactionId
 	uint64		value;
 } FullTransactionId;
 
+static inline bool
+FullTransactionIdIsValid(FullTransactionId fxid)
+{
+	/* XXX: why without epoch? Should it be fxid.value == InvalidTransactionId? */
+	return TransactionIdIsValid(XidFromFullTransactionId(fxid));
+}
+
+static inline bool
+FullTransactionIdIsInvalid(FullTransactionId fxid)
+{
+	return !FullTransactionIdIsValid(fxid);
+}
+
 static inline FullTransactionId
 FullTransactionIdFromEpochAndXid(uint32 epoch, TransactionId xid)
 {
diff --git a/src/include/storage/proc.h b/src/include/storage/proc.h
index e87fd25d64..bc40bae6d5 100644
--- a/src/include/storage/proc.h
+++ b/src/include/storage/proc.h
@@ -176,7 +176,7 @@ struct PGPROC
 	Latch		procLatch;		/* generic latch for process */
 
 
-	TransactionId xid;			/* id of top-level transaction currently being
+	FullTransactionId xid;		/* id of top-level transaction currently being
 								 * executed by this proc, if running and XID
 								 * is assigned; else InvalidTransactionId.
 								 * mirrored in ProcGlobal->xids[pgxactoff] */
-- 
2.42.0

Matthias van de Meent

boekewurm+postgres@gmail.com

about 2 years ago

In reply to: Maxim Orlov (#1)

Re: Next step towards 64bit XIDs: Switch to FullTransactionId for PGPROC->xid and XLogRecord->xl_xid

On Fri, 29 Dec 2023, 13:49 Maxim Orlov, <orlovmg@gmail.com> wrote:

Hi!

As were discussed in [0] our overall goal is to make Postgres 64 bit
XIDs. It's obvious, that such a big patch set
couldn't possible to commit "at once". SLUR patch set [1] was committed a
short while ago as a first significant
step in this direction.

This thread is a next step in this enterprise. My objective here is to
commit some changes, which were mandatory,
as far as I understand, for any type of 64 XIDs implementation. And I'm
sure there will be points for discussion here.

My original intention was to make PGPROC->xmin, PGPROC->xid and
PROC_HDR->xids 64bit. But in reality,
it turned out to be much more difficult than I expected. On the one hand,
the patch became too big and on the other
hand, it's heavily relayed on epoch and XID "adjustment" to FXID. Therefore,
for now, I decided to limit myself to
more atomic and independent changes. However, as I said above, these
changes are required for any implementation
of 64bit XIDs.

So, PFA patches to make switch PGPROC->xid

I think this could be fine, but ...

and XLogRecord->xl_xid to FullTransactionId.

I don't think this is an actionable change, as this wastes 4 more bytes (or
8 with alignment) in nearly all WAL records that don't use the
HEAP/HEAP2/XLOG rmgrs, which would then be up to 10 (if not 14, when
64but-aligned) bytes per record. Unless something like [0]https://commitfest.postgresql.org/46/4386/ gets committed
this will add a significant write overhead to all operations, even if they
are not doing anything that needs an XID.

Also, I don't think we need to support transactions that stay alive and
change things for longer than 2^31 concurrently created transactions, so we
could well add a fxid to each WAL segment header (and checkpoint record?)
and calculate the fxid of each record as a relative fxid off of that.

Kind regards

Matthias van de Meent

[0]: https://commitfest.postgresql.org/46/4386/

Michael Paquier

michael@paquier.xyz

about 2 years ago

In reply to: Matthias van de Meent (#2)

Re: Next step towards 64bit XIDs: Switch to FullTransactionId for PGPROC->xid and XLogRecord->xl_xid

On Fri, Dec 29, 2023 at 02:36:19PM +0100, Matthias van de Meent wrote:

I don't think this is an actionable change, as this wastes 4 more bytes (or
8 with alignment) in nearly all WAL records that don't use the
HEAP/HEAP2/XLOG rmgrs, which would then be up to 10 (if not 14, when
64but-aligned) bytes per record. Unless something like [0] gets committed
this will add a significant write overhead to all operations, even if they
are not doing anything that needs an XID.

I was eyeing at the patches before reading your comment, and saw this
across the two patches:

@@ -176,7 +176,7 @@ struct PGPROC
Latch procLatch; /* generic latch for process */

-	TransactionId xid;			/* id of top-level transaction currently being
+	FullTransactionId xid;		/* id of top-level transaction currently being
[...]
 typedef struct XLogRecord
 {
 	uint32		xl_tot_len;		/* total len of entire record */
-	TransactionId xl_xid;		/* xact id */
+	pg_crc32c	xl_crc;			/* CRC for this record */
+	FullTransactionId xl_xid;	/* xact id */

And FWIW, echoing with Matthias, making these generic structures
arbitrary larger is a non-starter. We should try to make them
shorter.
--
Michael

Maxim Orlov

orlovmg@gmail.com

about 2 years ago

In reply to: Michael Paquier (#3)

Re: Next step towards 64bit XIDs: Switch to FullTransactionId for PGPROC->xid and XLogRecord->xl_xid

On Fri, 29 Dec 2023 at 16:36, Matthias van de Meent <
boekewurm+postgres@gmail.com> wrote:

I don't think this is an actionable change, as this wastes 4 more bytes
(or 8 with alignment) in nearly all WAL records that don't use the
HEAP/HEAP2/XLOG rmgrs, which would then be up to 10 (if not 14, when
64but-aligned) bytes per record. Unless something like [0] gets committed
this will add a significant write overhead to all operations, even if they
are not doing anything that needs an XID.

Also, I don't think we need to support transactions that stay alive and
change things for longer than 2^31 concurrently created transactions, so we
could well add a fxid to each WAL segment header (and checkpoint record?)
and calculate the fxid of each record as a relative fxid off of that.

Thank you for your reply. Yes, this is a good idea. Actually, this is

exactly what I was trying to do at first.
But in this case, we have to add more locks on TransamVariables->nextXid,
and I've abandoned the idea.
Maybe, I should look in this direction.

On Sun, 31 Dec 2023 at 03:44, Michael Paquier <michael@paquier.xyz> wrote:

And FWIW, echoing with Matthias, making these generic structures
arbitrary larger is a non-starter. We should try to make them
shorter.

Yeah, obviously, this is patch make WAL bigger. I'll try to look into the
idea of fxid calculation, as mentioned above.
It might in part be a "chicken and the egg" situation. It is very hard to
split overall 64xid patch into smaller pieces
with each part been a meaningful and beneficial for current 32xids Postgres.

Anyway, thanks for reply, appreciate it.

--
Best regards,
Maxim Orlov.

Robert Haas

robertmhaas@gmail.com

about 2 years ago

In reply to: Maxim Orlov (#4)

Re: Next step towards 64bit XIDs: Switch to FullTransactionId for PGPROC->xid and XLogRecord->xl_xid

On Mon, Jan 1, 2024 at 1:15 AM Maxim Orlov <orlovmg@gmail.com> wrote:

Yeah, obviously, this is patch make WAL bigger. I'll try to look into the idea of fxid calculation, as mentioned above.
It might in part be a "chicken and the egg" situation. It is very hard to split overall 64xid patch into smaller pieces
with each part been a meaningful and beneficial for current 32xids Postgres.

I agree, but I think that's what we need to try to do. I mean, we
don't want to commit patches that individually make things *worse* on
the theory that when we do that enough times the result will be
overall better. And even if the individual patches seem to be entirely
neutral, that still doesn't offer great hope that the final result
will be an improvement.

Personally, I'm not convinced that "use 64-bit XIDs everywhere" is a
desirable goal. The space consumption issue here is just the tip of
the iceberg, and really exists in many places where we store XIDs. We
don't want to make tuple headers wider any more than we want to bloat
the WAL. We don't want to make the ProcArray bigger, either, not
because of memory consumption but because of the speed of memory
access. What we really want to do is fix the practical problems that
32-bit XIDs cause. AFAIK there are basically three such problems:

1. SLRUs that are indexed by XID can wrap around, or try to, leading
to confusion and bugs in the code and maybe errors when something
fills up.

2. If a table's relfrozenxid become older than ~2B, we have to stop
XID assignment until the problem is cured.

3. If a running transaction's XID age exceeds ~2B, we can't start new
transactions until the problem is cured.

There's already a patch for (1), I believe. I think we can cure that
by indexing SLRUs by 64-bit integers without changing how XIDs are
stored outside of SLRUs. Various people have attempted (2), for
example with an XID-per-page approach, but nothing's been committed
yet. We can't attempt to cure (3) until (1) or (2) are fixed, and I
think that would require using 64-bit XIDs in the ProcArray, but note
that (3) is less serious: (1) and (2) constrain the size of the XID
range that can legally appear on disk, whereas (3) only constraints
the size of the XID range that can legally occur in memory. That makes
a big difference, because reducing the size of the XID range that can
legally appear on disk requires vacuuming all tables with older
relfrozenxids which in the worst case takes time proportional to the
disk utilization of the instance, whereas reducing the size of the XID
range that can legally appear in memory just requires ending
transactions (including possibly prepared transactions) and removing
replication slots, which can be done in O(1) time.

Maybe this analysis I've just given isn't quite right, but my point is
that we should try to think hard about where in the system 32-bit XIDs
suck and for what reason, and use that as a guide to what to change
first.

--
Robert Haas
EDB: http://www.enterprisedb.com

Jim Nasby

jim.nasby@gmail.com

about 2 years ago

In reply to: Robert Haas (#5)

Re: Next step towards 64bit XIDs: Switch to FullTransactionId for PGPROC->xid and XLogRecord->xl_xid

<!DOCTYPE html>
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
</head>
<body>
<div class="moz-cite-prefix">On 1/2/24 1:58 PM, Robert Haas wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CA+TgmoYRg1MDF6a-QDrccVO=d2S37bjz54=xSzwen9FYHqw=7Q@mail.gmail.com">
<pre>Maybe this analysis I've just given isn't quite right, but my point is
that we should try to think hard about where in the system 32-bit XIDs
suck and for what reason, and use that as a guide to what to change
first.</pre>
</blockquote>
<p>Very much this. The biggest reason 2B XIDs are such an issue is
because it's incredibly expensive to move the window forward,
which is governed by on-disk stuff.<br>
</p>
</body>
</html>

Pavel Borisov

pashkin.elfe@gmail.com

about 2 years ago

In reply to: Maxim Orlov (#4)

Re: Next step towards 64bit XIDs: Switch to FullTransactionId for PGPROC->xid and XLogRecord->xl_xid

Hi, Maxim, and Happy New Year!

On Mon, 1 Jan 2024 at 10:15, Maxim Orlov <orlovmg@gmail.com> wrote:

On Fri, 29 Dec 2023 at 16:36, Matthias van de Meent <
boekewurm+postgres@gmail.com> wrote:

I don't think this is an actionable change, as this wastes 4 more bytes
(or 8 with alignment) in nearly all WAL records that don't use the
HEAP/HEAP2/XLOG rmgrs, which would then be up to 10 (if not 14, when
64but-aligned) bytes per record. Unless something like [0] gets committed
this will add a significant write overhead to all operations, even if they
are not doing anything that needs an XID.

Also, I don't think we need to support transactions that stay alive and
change things for longer than 2^31 concurrently created transactions, so we
could well add a fxid to each WAL segment header (and checkpoint record?)
and calculate the fxid of each record as a relative fxid off of that.

Thank you for your reply. Yes, this is a good idea. Actually, this is

exactly what I was trying to do at first.
But in this case, we have to add more locks on TransamVariables->nextXid,
and I've abandoned the idea.
Maybe, I should look in this direction.

On Sun, 31 Dec 2023 at 03:44, Michael Paquier <michael@paquier.xyz> wrote:

And FWIW, echoing with Matthias, making these generic structures
arbitrary larger is a non-starter. We should try to make them
shorter.

Yeah, obviously, this is patch make WAL bigger. I'll try to look into the
idea of fxid calculation, as mentioned above.
It might in part be a "chicken and the egg" situation. It is very hard to
split overall 64xid patch into smaller pieces
with each part been a meaningful and beneficial for current 32xids
Postgres.

The discussion quoted by you as OP [0]/messages/by-id/CACG=ezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe=pyyjVWA@mail.gmail.com shows that there is a wide range of
opinions on whether we need 64-bit XIDs and what is the better way to
implement it. We can also continue discussing implementation details in
that thread [0]/messages/by-id/CACG=ezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe=pyyjVWA@mail.gmail.com.

I agree that the step towards something big should not make things worse
just now. Does increasing the WAL header for several bytes make performance
change detectable at all? Maybe it's worth simulating a workload with a
large amount of WAL records and seeing how it works. I think it's regarding
this patchset only and could remove or substantiate the main questions
about the current patchset.

[0]: /messages/by-id/CACG=ezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe=pyyjVWA@mail.gmail.com
/messages/by-id/CACG=ezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe=pyyjVWA@mail.gmail.com

Kind regards,
Pavel Borisov.