mxid and mxoff wraparound issues in pg_upgrade

Started by Maxim Orlov7 months ago3 messages
#1Maxim Orlov
orlovmg@gmail.com
2 attachment(s)

Hi!

I've noticed two bugs reported [0]/messages/by-id/18863-72f08858855344a2@postgresql.org and [1]/messages/by-id/18865-d4c66cf35c2a67af@postgresql.org which are both related to the
wraparound of mxid and
mxoff. Problems for mxid and mxoff are minor, as they require hitting the
exact overflow limit to
occur. But it's better to correct it.

I included a test to reproduce the problem (see 0001). It is not intended
to be committed, I
guess. I then added a commit with a fix.

There two fixes.
1) pg_upgrade does not consider the mxid to be in a wraparound state. In
this case, I adjust
the mxid value to the FirstMultiXactId. Alternatively, it might be
conceivable to include the
ability to set InvalidMultiXactId in pg_resetwal, but this seems odd to
me.
2) pg_resetwall forbids to set mxoff to UINT_MAX32. I'm not sure if this
was done on
purpose or not, but perhaps this check can be removed.

Here is what I've got with the fix applied.

before upgrade:

Latest checkpoint's NextMultiXactId: 0
Latest checkpoint's NextMultiOffset: 4294967295
Latest checkpoint's oldestMultiXid: 4294967295

after upgrade:

Latest checkpoint's NextMultiXactId: 1
Latest checkpoint's NextMultiOffset: 4294967295
Latest checkpoint's oldestMultiXid: 4294967295

Thoughts?

[0]: /messages/by-id/18863-72f08858855344a2@postgresql.org
/messages/by-id/18863-72f08858855344a2@postgresql.org
[1]: /messages/by-id/18865-d4c66cf35c2a67af@postgresql.org
/messages/by-id/18865-d4c66cf35c2a67af@postgresql.org

--
Best regards,
Maxim Orlov.

Attachments:

v1-0001-Test-case-for-BUG-18863-and-BUG-18865.patchapplication/octet-stream; name=v1-0001-Test-case-for-BUG-18863-and-BUG-18865.patchDownload
From 2565a6b914e814063b1c0ed501c60fd8b5942f3d Mon Sep 17 00:00:00 2001
From: Maxim Orlov <orlovmg@gmail.com>
Date: Fri, 20 Jun 2025 17:20:59 +0300
Subject: [PATCH v1 1/2] Test case for BUG #18863 and BUG #18865

---
 src/bin/pg_upgrade/meson.build     |   1 +
 src/bin/pg_upgrade/t/007_mxwrap.pl | 111 +++++++++++++++++++++++++++++
 2 files changed, 112 insertions(+)
 create mode 100644 src/bin/pg_upgrade/t/007_mxwrap.pl

diff --git a/src/bin/pg_upgrade/meson.build b/src/bin/pg_upgrade/meson.build
index ac992f0d14..aff3c5ddb4 100644
--- a/src/bin/pg_upgrade/meson.build
+++ b/src/bin/pg_upgrade/meson.build
@@ -47,6 +47,7 @@ tests += {
       't/004_subscription.pl',
       't/005_char_signedness.pl',
       't/006_transfer_modes.pl',
+      't/007_mxwrap.pl',
     ],
     'test_kwargs': {'priority': 40}, # pg_upgrade tests are slow
   },
diff --git a/src/bin/pg_upgrade/t/007_mxwrap.pl b/src/bin/pg_upgrade/t/007_mxwrap.pl
new file mode 100644
index 0000000000..2cba978c18
--- /dev/null
+++ b/src/bin/pg_upgrade/t/007_mxwrap.pl
@@ -0,0 +1,111 @@
+#
+# Copyright (c) 2025, PostgreSQL Global Development Group
+#
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+use Test::More;
+
+sub print_controldata_info
+{
+	my $node = shift;
+	my ($stdout, $stderr) = run_command([ 'pg_controldata', $node->data_dir ]);
+
+	foreach (split("\n", $stdout))
+	{
+		if ($_ =~ /^Latest checkpoint's Next\s*(.*)$/mg or
+			$_ =~ /^Latest checkpoint's oldest\s*(.*)$/mg)
+		{
+			print $_."\n";
+		}
+	}
+}
+
+# 1) Create the new cluster - source_cluster
+my $oldnode = PostgreSQL::Test::Cluster->new('main');
+$oldnode->init;
+
+# 2) Reset mxid to the limit
+command_ok(
+	[ 'pg_resetwal', '-m', '4294967295,4294967295', $oldnode->data_dir ],
+	'approaching the mxid limit');
+command_ok(
+	[ 'pg_resetwal', '-O', '4294967293', $oldnode->data_dir ],
+	'approaching the mxoff limit');
+
+# 3) Create files in pg_multixact offsets and members
+my $offsets_seg = $oldnode->data_dir . '/pg_multixact/offsets/FFFF';
+open my $fh1, '>', $offsets_seg or BAIL_OUT($!);
+binmode $fh1;
+print $fh1 pack("x[262144]");
+close $fh1;
+
+my $members_seg = $oldnode->data_dir . '/pg_multixact/members/14078';
+open my $fh2, '>', $members_seg or BAIL_OUT($!);
+binmode $fh2;
+print $fh2 pack("x[262144]");
+close $fh2;
+
+# 4) Create the test table
+$oldnode->start;
+$oldnode->safe_psql('postgres',
+qq(
+	CREATE TABLE test_table (id integer NOT NULL PRIMARY KEY, val text);
+	INSERT INTO test_table VALUES (1, 'a');
+));
+
+# 5-9) Create a multitransaction
+my $conn1 = $oldnode->background_psql('postgres');
+my $conn2 = $oldnode->background_psql('postgres');
+
+$conn1->query_safe(qq(
+	BEGIN;
+	SELECT * FROM test_table WHERE id = 1 FOR SHARE;
+));
+$conn2->query_safe(qq(
+	BEGIN;
+	SELECT * FROM test_table WHERE id = 1 FOR SHARE;
+));
+
+$conn1->query_safe(qq(COMMIT;));
+$conn2->query_safe(qq(COMMIT;));
+
+$conn1->quit;
+$conn2->quit;
+
+$oldnode->stop;
+
+# 10) Run pg_upgrade
+my $newnode = PostgreSQL::Test::Cluster->new('new_node');
+$newnode->init;
+
+command_ok(
+	[
+		'pg_upgrade', '--no-sync', '-d', $oldnode->data_dir,
+		'-D', $newnode->data_dir, '-b', $oldnode->config_data('--bindir'),
+		'-B', $newnode->config_data('--bindir'), '-s', $newnode->host,
+		'-p', $oldnode->port, '-P', $newnode->port,
+		'--copy',
+	],
+	'run of pg_upgrade --check for new instance');
+ok(!-d $newnode->data_dir . "/pg_upgrade_output.d",
+	"pg_upgrade_output.d/ removed after pg_upgrade --check success");
+
+print ">>> OLD NODE: \n";
+print_controldata_info($oldnode);
+
+print "\n>>> NEW NODE: \n";
+print_controldata_info($newnode);
+
+# just in case...
+$newnode->start;
+$oldnode->start;
+is($newnode->safe_psql('postgres', qq(TABLE test_table;)),
+	$oldnode->safe_psql('postgres', qq(TABLE test_table;)),
+	'check invariant');
+$newnode->stop;
+$oldnode->stop;
+
+done_testing();
-- 
2.43.0

v1-0002-Fix-the-mxid-and-mxoff-wraparound-issues-in-pg_up.patchapplication/octet-stream; name=v1-0002-Fix-the-mxid-and-mxoff-wraparound-issues-in-pg_up.patchDownload
From dd5775184a729c2b5cfff40ff5ab55eed936d2fb Mon Sep 17 00:00:00 2001
From: Maxim Orlov <orlovmg@gmail.com>
Date: Fri, 20 Jun 2025 17:21:08 +0300
Subject: [PATCH v1 2/2] Fix the mxid and mxoff wraparound issues in
 pg_upgrade.

Per bags:
- BUG #18863: Multixact wraparound and pg_resetwal error "multitransaction ID (-m) must not be 0"
- BUG #18865: pg_resetwal error: multitransaction offset (-O) must not be -1
---
 src/bin/pg_resetwal/pg_resetwal.c |  5 +----
 src/bin/pg_upgrade/pg_upgrade.c   | 11 ++++++++++-
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c
index e876f35f38..3b984eeae0 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -273,8 +273,6 @@ main(int argc, char *argv[])
 					pg_log_error_hint("Try \"%s --help\" for more information.", progname);
 					exit(1);
 				}
-				if (set_mxoff == -1)
-					pg_fatal("multitransaction offset (-O) must not be -1");
 				break;
 
 			case 'l':
@@ -463,8 +461,7 @@ main(int argc, char *argv[])
 		ControlFile.checkPointCopy.oldestMultiDB = InvalidOid;
 	}
 
-	if (set_mxoff != -1)
-		ControlFile.checkPointCopy.nextMultiOffset = set_mxoff;
+	ControlFile.checkPointCopy.nextMultiOffset = set_mxoff;
 
 	if (minXlogTli > ControlFile.checkPointCopy.ThisTimeLineID)
 	{
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index 536e49d261..0bfca6865e 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -793,6 +793,15 @@ copy_xact_xlog_xid(void)
 	if (old_cluster.controldata.cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER &&
 		new_cluster.controldata.cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER)
 	{
+		uint32	old_chkpnt_nxtmulti = old_cluster.controldata.chkpnt_nxtmulti;
+
+		/*
+		 * Beware of the possibility that chkpnt_nxtmulti is in the
+		 * wrapped-around state in old cluster.
+		 */
+		if (old_chkpnt_nxtmulti == 0)
+			old_chkpnt_nxtmulti = 1;	/* FirstMultiXactId */
+
 		copy_subdir_files("pg_multixact/offsets", "pg_multixact/offsets");
 		copy_subdir_files("pg_multixact/members", "pg_multixact/members");
 
@@ -806,7 +815,7 @@ copy_xact_xlog_xid(void)
 				  "\"%s/pg_resetwal\" -O %u -m %u,%u \"%s\"",
 				  new_cluster.bindir,
 				  old_cluster.controldata.chkpnt_nxtmxoff,
-				  old_cluster.controldata.chkpnt_nxtmulti,
+				  old_chkpnt_nxtmulti,
 				  old_cluster.controldata.chkpnt_oldstMulti,
 				  new_cluster.pgdata);
 		check_ok();
-- 
2.43.0

#2Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Maxim Orlov (#1)
1 attachment(s)
Re: mxid and mxoff wraparound issues in pg_upgrade

On 20/06/2025 18:45, Maxim Orlov wrote:

Hi!

I've noticed two bugs reported [0] and [1] which are both related to the
wraparound of mxid and
mxoff. Problems for mxid and mxoff are minor, as they require hitting
the exact overflow limit to
occur. But it's better to correct it.

I included a test to reproduce the problem (see 0001). It is not
intended to be committed, I
guess. I then added a commit with a fix.

There two fixes.
1) pg_upgrade does not consider the mxid to be in a wraparound state. In
this case, I adjust
    the mxid value to the FirstMultiXactId. Alternatively, it might be
conceivable to include the
    ability to set InvalidMultiXactId in pg_resetwal, but this seems
odd to me.

I think we should allow InvalidMultiXactId in pg_resetwal. It can appear
in the control file of existing clusters, so it seems weird if you
cannot use pg_resetwal to reset to that value.

I also find it weird that we ever store InvalidMultiXactId in the
control file, and have code in all the places that read it to deal with
the possibility. It would seem much more straightforward to skip over
InvalidMultiXactId when we set 'nextMXact'. I think we should change
that in v19, but that's a separate patch.

2) pg_resetwall forbids to set mxoff to UINT_MAX32. I'm not sure if this
was done on
    purpose or not, but perhaps this check can be removed.

With your patch, pg_resetwal will *always* reset nextMultiOffset, even
if you didn't specify --multixact-offset.

I propose the attached fix. It adds separate booleans to pg_resetwal to
track whether -m or -O option was given, and allows 0 for the next
multixid and UINT32_MAX for next multixact offset.

Can oldestMultiXactId ever be 0? I think not, but then again
ReadMultiXactIdRange() seems to be prepared to deal with that.

- Heikki

Attachments:

v2-0001-Fix-the-mxid-and-mxoff-wraparound-issues-in-pg_up.patchtext/x-patch; charset=UTF-8; name=v2-0001-Fix-the-mxid-and-mxoff-wraparound-issues-in-pg_up.patchDownload
From 924c63daf1ceb9d796e18567afa6904cbe672a00 Mon Sep 17 00:00:00 2001
From: Maxim Orlov <orlovmg@gmail.com>
Date: Fri, 20 Jun 2025 17:21:08 +0300
Subject: [PATCH v2 1/1] Fix the mxid and mxoff wraparound issues in
 pg_upgrade.

Per bags:
- BUG #18863: Multixact wraparound and pg_resetwal error "multitransaction ID (-m) must not be 0"
- BUG #18865: pg_resetwal error: multitransaction offset (-O) must not be -1
---
 src/bin/pg_resetwal/pg_resetwal.c | 18 +++++++++---------
 src/bin/pg_upgrade/pg_upgrade.c   | 11 ++++++++++-
 2 files changed, 19 insertions(+), 10 deletions(-)

diff --git a/src/bin/pg_resetwal/pg_resetwal.c b/src/bin/pg_resetwal/pg_resetwal.c
index a89d72fc5cf..3d61f025c5e 100644
--- a/src/bin/pg_resetwal/pg_resetwal.c
+++ b/src/bin/pg_resetwal/pg_resetwal.c
@@ -70,8 +70,10 @@ static TransactionId set_xid = 0;
 static TransactionId set_oldest_commit_ts_xid = 0;
 static TransactionId set_newest_commit_ts_xid = 0;
 static Oid	set_oid = 0;
+static bool mxid_given = false;
 static MultiXactId set_mxid = 0;
-static MultiXactOffset set_mxoff = (MultiXactOffset) -1;
+static bool mxoff_given = false;
+static MultiXactOffset set_mxoff = 0;
 static TimeLineID minXlogTli = 0;
 static XLogSegNo minXlogSegNo = 0;
 static int	WalSegSz;
@@ -246,6 +248,7 @@ main(int argc, char *argv[])
 					pg_log_error_hint("Try \"%s --help\" for more information.", progname);
 					exit(1);
 				}
+				mxid_given = true;
 
 				set_oldestmxid = strtoul(endptr + 1, &endptr2, 0);
 				if (endptr2 == endptr + 1 || *endptr2 != '\0' || errno != 0)
@@ -254,8 +257,6 @@ main(int argc, char *argv[])
 					pg_log_error_hint("Try \"%s --help\" for more information.", progname);
 					exit(1);
 				}
-				if (set_mxid == 0)
-					pg_fatal("multitransaction ID (-m) must not be 0");
 
 				/*
 				 * XXX It'd be nice to have more sanity checks here, e.g. so
@@ -274,8 +275,7 @@ main(int argc, char *argv[])
 					pg_log_error_hint("Try \"%s --help\" for more information.", progname);
 					exit(1);
 				}
-				if (set_mxoff == -1)
-					pg_fatal("multitransaction offset (-O) must not be -1");
+				mxoff_given = true;
 				break;
 
 			case 'l':
@@ -454,7 +454,7 @@ main(int argc, char *argv[])
 	if (set_oid != 0)
 		ControlFile.checkPointCopy.nextOid = set_oid;
 
-	if (set_mxid != 0)
+	if (mxid_given)
 	{
 		ControlFile.checkPointCopy.nextMulti = set_mxid;
 
@@ -464,7 +464,7 @@ main(int argc, char *argv[])
 		ControlFile.checkPointCopy.oldestMultiDB = InvalidOid;
 	}
 
-	if (set_mxoff != -1)
+	if (mxoff_given)
 		ControlFile.checkPointCopy.nextMultiOffset = set_mxoff;
 
 	if (minXlogTli > ControlFile.checkPointCopy.ThisTimeLineID)
@@ -805,7 +805,7 @@ PrintNewControlValues(void)
 				 newXlogSegNo, WalSegSz);
 	printf(_("First log segment after reset:        %s\n"), fname);
 
-	if (set_mxid != 0)
+	if (mxid_given)
 	{
 		printf(_("NextMultiXactId:                      %u\n"),
 			   ControlFile.checkPointCopy.nextMulti);
@@ -815,7 +815,7 @@ PrintNewControlValues(void)
 			   ControlFile.checkPointCopy.oldestMultiDB);
 	}
 
-	if (set_mxoff != -1)
+	if (mxoff_given)
 	{
 		printf(_("NextMultiOffset:                      %u\n"),
 			   ControlFile.checkPointCopy.nextMultiOffset);
diff --git a/src/bin/pg_upgrade/pg_upgrade.c b/src/bin/pg_upgrade/pg_upgrade.c
index 490e98fa26f..2a317e59108 100644
--- a/src/bin/pg_upgrade/pg_upgrade.c
+++ b/src/bin/pg_upgrade/pg_upgrade.c
@@ -816,6 +816,15 @@ copy_xact_xlog_xid(void)
 	if (old_cluster.controldata.cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER &&
 		new_cluster.controldata.cat_ver >= MULTIXACT_FORMATCHANGE_CAT_VER)
 	{
+		uint32	old_chkpnt_nxtmulti = old_cluster.controldata.chkpnt_nxtmulti;
+
+		/*
+		 * Beware of the possibility that chkpnt_nxtmulti is in the
+		 * wrapped-around state in old cluster.
+		 */
+		if (old_chkpnt_nxtmulti == 0)
+			old_chkpnt_nxtmulti = 1;	/* FirstMultiXactId */
+
 		copy_subdir_files("pg_multixact/offsets", "pg_multixact/offsets");
 		copy_subdir_files("pg_multixact/members", "pg_multixact/members");
 
@@ -829,7 +838,7 @@ copy_xact_xlog_xid(void)
 				  "\"%s/pg_resetwal\" -O %u -m %u,%u \"%s\"",
 				  new_cluster.bindir,
 				  old_cluster.controldata.chkpnt_nxtmxoff,
-				  old_cluster.controldata.chkpnt_nxtmulti,
+				  old_chkpnt_nxtmulti,
 				  old_cluster.controldata.chkpnt_oldstMulti,
 				  new_cluster.pgdata);
 		check_ok();
-- 
2.47.3

#3Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Heikki Linnakangas (#2)
Re: mxid and mxoff wraparound issues in pg_upgrade

On 05/11/2025 14:15, Heikki Linnakangas wrote:

I propose the attached fix. It adds separate booleans to pg_resetwal to
track whether -m or -O option was given, and allows 0 for the next
multixid and UINT32_MAX for next multixact offset.

Committed, with one more change:

I replaced strtoul() with strtoi64() in parsing the multixact offset.
strtoul() silently casts negative values to positive ones, which means
that even before this patch, we accepted any negative value, except for
-1 which was checked specifically. That's bogus, and with the previous
patch I posted, we would start accepting -1 too, which makes it even
worse. By using strtoi64() and converting to uint32 explicitly, we now
check and error out on all negative values.

We have the same issue with all the other flags that use strtoul(), but
I didn't want to touch those right now, especially in back-branches. But
I also didn't want to start accepting -1 for -O. On 'master' I think it
would make sense to tighten up the other flags, too. I don't plan to
work on that myself but would be happy to review if someone writes the
patch.

I didn't commit the new pg_upgrade test case, but many thanks for
providing it. It was very helpful in demonstrating the problem.

- Heikki