pg_upgrade: Support for upgrading to checksums enabled

Started by Peter Eisentrautover 1 year ago7 messages
#1Peter Eisentraut
peter@eisentraut.org
1 attachment(s)

I'm reposting this patch in a separate thread so I can make a separate
commitfest entry for it. The previous discussion is mixed in with [0]/messages/by-id/CAKAnmmKwiMHik5AHmBEdf5vqzbOBbcwEPHo4-PioWeAbzwcTOQ@mail.gmail.com.

The purpose of this patch is to allow using pg_upgrade between clusters
that have different checksum settings. When upgrading between instances
with different checksum settings, the --copy (default) mode
automatically sets (or unsets) the checksum on the fly.

This would be particularly useful if we switched to enabling checksums
by default, as [0]/messages/by-id/CAKAnmmKwiMHik5AHmBEdf5vqzbOBbcwEPHo4-PioWeAbzwcTOQ@mail.gmail.com proposes, but it's also useful without that.

Some discussion points:

- We have added a bunch of different transfer modes to pg_upgrade in
order to give the user control over the expected file transfer
performance. Here, I have added this checksum rewriting to the existing
--copy mode, and I have measured about a 5% overhead. An alternative
would be to make this an explicit mode (--copy-slow,
--copy-and-make-adjustments).

- Windows has a separate code path in the --copy mode. I don't know the
reasons or advantages of that. So it's not clear how the checksum
rewriting mode should be handled in that case. We could switch to the
non-Windows code path in that case, but then the performance difference
between the normal path and the checksum-rewriting path is even more
unclear.

[0]: /messages/by-id/CAKAnmmKwiMHik5AHmBEdf5vqzbOBbcwEPHo4-PioWeAbzwcTOQ@mail.gmail.com
/messages/by-id/CAKAnmmKwiMHik5AHmBEdf5vqzbOBbcwEPHo4-PioWeAbzwcTOQ@mail.gmail.com

Attachments:

v1-0001-pg_upgrade-Support-for-upgrading-to-checksums-ena.patchtext/plain; charset=UTF-8; name=v1-0001-pg_upgrade-Support-for-upgrading-to-checksums-ena.patchDownload
From a050f2d857b2e5321b9e1110f6a41185d91c51ee Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Thu, 22 Aug 2024 07:56:04 +0200
Subject: [PATCH v1] pg_upgrade: Support for upgrading to checksums enabled

When upgrading between instances with different checksum settings, the
--copy (default) mode automatically sets (or unsets) the checksum on
the fly.

TODO: What to do about the Windows code path?
---
 src/bin/pg_upgrade/controldata.c   | 17 +++++-----------
 src/bin/pg_upgrade/file.c          | 31 +++++++++++++++++++++++++++++-
 src/bin/pg_upgrade/pg_upgrade.h    |  3 ++-
 src/bin/pg_upgrade/relfilenumber.c |  2 +-
 4 files changed, 38 insertions(+), 15 deletions(-)

diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index 854c6887a23..583cbf109cf 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -693,18 +693,11 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
-	/*
-	 * We might eventually allow upgrades from checksum to no-checksum
-	 * clusters.
-	 */
-	if (oldctrl->data_checksum_version == 0 &&
-		newctrl->data_checksum_version != 0)
-		pg_fatal("old cluster does not use data checksums but the new one does");
-	else if (oldctrl->data_checksum_version != 0 &&
-			 newctrl->data_checksum_version == 0)
-		pg_fatal("old cluster uses data checksums but the new one does not");
-	else if (oldctrl->data_checksum_version != newctrl->data_checksum_version)
-		pg_fatal("old and new cluster pg_controldata checksum versions do not match");
+	if (oldctrl->data_checksum_version != newctrl->data_checksum_version)
+	{
+		if (user_opts.transfer_mode != TRANSFER_MODE_COPY)
+			pg_fatal("when upgrading between clusters with different data checksum settings, transfer mode --copy must be used");
+	}
 }
 
 
diff --git a/src/bin/pg_upgrade/file.c b/src/bin/pg_upgrade/file.c
index 73932504cae..0a16ad79237 100644
--- a/src/bin/pg_upgrade/file.c
+++ b/src/bin/pg_upgrade/file.c
@@ -80,12 +80,14 @@ cloneFile(const char *src, const char *dst,
  */
 void
 copyFile(const char *src, const char *dst,
-		 const char *schemaName, const char *relName)
+		 const char *schemaName, const char *relName,
+		 int segno)
 {
 #ifndef WIN32
 	int			src_fd;
 	int			dest_fd;
 	char	   *buffer;
+	BlockNumber blocknum;
 
 	if ((src_fd = open(src, O_RDONLY | PG_BINARY, 0)) < 0)
 		pg_fatal("error while copying relation \"%s.%s\": could not open file \"%s\": %m",
@@ -101,6 +103,8 @@ copyFile(const char *src, const char *dst,
 
 	buffer = (char *) pg_malloc(COPY_BUF_SIZE);
 
+	blocknum = segno * old_cluster.controldata.largesz;
+
 	/* perform data copying i.e read src source, write to destination */
 	while (true)
 	{
@@ -113,6 +117,31 @@ copyFile(const char *src, const char *dst,
 		if (nbytes == 0)
 			break;
 
+		/*
+		 * Update checksums if upgrading between different settings
+		 */
+		if (old_cluster.controldata.data_checksum_version != new_cluster.controldata.data_checksum_version)
+		{
+			/* must deal in whole blocks */
+			nbytes = nbytes / BLCKSZ * BLCKSZ;
+
+			for (int i = 0; i * BLCKSZ < nbytes; i++)
+			{
+				char	   *page = buffer + i * BLCKSZ;
+				PageHeader	header = (PageHeader) page;
+
+				if (PageIsNew(page))
+					continue;
+
+				if (new_cluster.controldata.data_checksum_version == 1)
+					header->pd_checksum = pg_checksum_page(page, blocknum);
+				else
+					header->pd_checksum = 0;
+
+				blocknum++;
+			}
+		}
+
 		errno = 0;
 		if (write(dest_fd, buffer, nbytes) != nbytes)
 		{
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index cdb6e2b7597..c6a4ade53a7 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -404,7 +404,8 @@ bool		pid_lock_file_exists(const char *datadir);
 void		cloneFile(const char *src, const char *dst,
 					  const char *schemaName, const char *relName);
 void		copyFile(const char *src, const char *dst,
-					 const char *schemaName, const char *relName);
+					 const char *schemaName, const char *relName,
+					 int segno);
 void		copyFileByRange(const char *src, const char *dst,
 							const char *schemaName, const char *relName);
 void		linkFile(const char *src, const char *dst,
diff --git a/src/bin/pg_upgrade/relfilenumber.c b/src/bin/pg_upgrade/relfilenumber.c
index 1d3054d78bd..2ae69d71159 100644
--- a/src/bin/pg_upgrade/relfilenumber.c
+++ b/src/bin/pg_upgrade/relfilenumber.c
@@ -250,7 +250,7 @@ transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_fro
 				case TRANSFER_MODE_COPY:
 					pg_log(PG_VERBOSE, "copying \"%s\" to \"%s\"",
 						   old_file, new_file);
-					copyFile(old_file, new_file, map->nspname, map->relname);
+					copyFile(old_file, new_file, map->nspname, map->relname, segno);
 					break;
 				case TRANSFER_MODE_COPY_FILE_RANGE:
 					pg_log(PG_VERBOSE, "copying \"%s\" to \"%s\" with copy_file_range",

base-commit: 9bb842f95ef3384f0822c386a4c569780e613e4e
-- 
2.46.0

#2Ilya Kosmodemiansky
ik@dataegret.com
In reply to: Peter Eisentraut (#1)
Re: pg_upgrade: Support for upgrading to checksums enabled

Hi Peter,

I've applied and tested your patch, it works at least on MacOS with
meson build. A couple of thoughts about this patch inline below.

On Mon, Aug 26, 2024 at 8:23 AM Peter Eisentraut <peter@eisentraut.org> wrote:

The purpose of this patch is to allow using pg_upgrade between clusters
that have different checksum settings. When upgrading between instances
with different checksum settings, the --copy (default) mode
automatically sets (or unsets) the checksum on the fly.

I think the entire idea of this patch is great because it allows us to
remove an additional step in upgrade procedure, i.e. enabling
checksums before upgrade. A part about which I am not quite sure, is
"automatically". It is sufficient in most cases, but maybe also to
have an explicit flag would be a nice option as well.

in the patch itself:

-        * We might eventually allow upgrades from checksum to no-checksum
-        * clusters.
-        */
-       if (oldctrl->data_checksum_version == 0 &&
-               newctrl->data_checksum_version != 0)
-               pg_fatal("old cluster does not use data checksums but the new one does");
-       else if (oldctrl->data_checksum_version != 0 &&
-                        newctrl->data_checksum_version == 0)
-               pg_fatal("old cluster uses data checksums but the new one does not");
-       else if (oldctrl->data_checksum_version != newctrl->data_checksum_version)
-               pg_fatal("old and new cluster pg_controldata checksum versions do not match");
+       if (oldctrl->data_checksum_version != newctrl->data_checksum_version)
+       {
+               if (user_opts.transfer_mode != TRANSFER_MODE_COPY)
+                       pg_fatal("when upgrading between clusters with different data checksum settings, transfer mode --copy must be used");
+       }

I've tried to recall when I see the previous error message "old and
new cluster pg_controldata checksum versions do not match" at most. It
was almost always pg_upgrade with --link option, because we mostly use
pg_upgrade when copy is barely an option. Previous error message gave
a clear statement: go one step behind and enable checksums. The new
one gives imo a wrong message: "your only option now is to use copy".

Would it be better to polish wording in direction "when upgrading
between clusters with different data checksum settings, transfer mode
--copy must be used to enable checksum automatically" or "when
upgrading between clusters with different data checksum settings,
transfer mode --copy must be used or data checksum settings of the old
cluster must be changed manually before upgrade"?

Some discussion points:

- We have added a bunch of different transfer modes to pg_upgrade in
order to give the user control over the expected file transfer
performance. Here, I have added this checksum rewriting to the existing
--copy mode, and I have measured about a 5% overhead. An alternative
would be to make this an explicit mode (--copy-slow,
--copy-and-make-adjustments).

Maybe a separate -k flag to enable this behavior explicitly?

best regards,
Ilya

--
Ilya Kosmodemiansky
CEO, Founder

Data Egret GmbH
Your remote PostgreSQL DBA team
T.: +49 6821 919 3297
ik@dataegret.com

#3Nathan Bossart
nathandbossart@gmail.com
In reply to: Peter Eisentraut (#1)
Re: pg_upgrade: Support for upgrading to checksums enabled

On Mon, Aug 26, 2024 at 08:23:44AM +0200, Peter Eisentraut wrote:

The purpose of this patch is to allow using pg_upgrade between clusters that
have different checksum settings. When upgrading between instances with
different checksum settings, the --copy (default) mode automatically sets
(or unsets) the checksum on the fly.

This would be particularly useful if we switched to enabling checksums by
default, as [0] proposes, but it's also useful without that.

Given enabling checksums can be rather expensive, I think it makes sense to
add a way to do it during pg_upgrade versus asking folks to run
pg_checksums separately. I'd anticipate arguments against enabling
checksums automatically, but as you noted, we can move it to a separate
option (e.g., --copy --enable-checksums). Disabling checksums with
pg_checksums is fast because it just updates pg_control, so I don't see any
need for --disable-checkums in pg_upgrade.

- Windows has a separate code path in the --copy mode. I don't know the
reasons or advantages of that. So it's not clear how the checksum rewriting
mode should be handled in that case. We could switch to the non-Windows
code path in that case, but then the performance difference between the
normal path and the checksum-rewriting path is even more unclear.

AFAICT the separate Windows path dates back to before pg_upgrade was first
added to the Postgres tree, and unfortunately I couldn't find any
discussion about it.

--
nathan

#4Robert Treat
rob@xzilla.net
In reply to: Nathan Bossart (#3)
Re: pg_upgrade: Support for upgrading to checksums enabled

On Tue, Aug 27, 2024 at 5:57 PM Nathan Bossart <nathandbossart@gmail.com> wrote:

On Mon, Aug 26, 2024 at 08:23:44AM +0200, Peter Eisentraut wrote:

The purpose of this patch is to allow using pg_upgrade between clusters that
have different checksum settings. When upgrading between instances with
different checksum settings, the --copy (default) mode automatically sets
(or unsets) the checksum on the fly.

This would be particularly useful if we switched to enabling checksums by
default, as [0] proposes, but it's also useful without that.

Given enabling checksums can be rather expensive, I think it makes sense to
add a way to do it during pg_upgrade versus asking folks to run
pg_checksums separately. I'd anticipate arguments against enabling
checksums automatically, but as you noted, we can move it to a separate
option (e.g., --copy --enable-checksums). Disabling checksums with
pg_checksums is fast because it just updates pg_control, so I don't see any
need for --disable-checkums in pg_upgrade.

The repercussions of either enabling or disabling checksums on
accident is quite high (not for pg_upgrade, but for $futureDBA), so
ISTM an explicit flag for BOTH is the right way to go. In that
scenario, pg_upgrade would check to make sure the clusters match and
then make the appropriate suggestion. In the case someone did
something like --enable-checksums and --link, again, we'd toss an
error that --copy mode is required to --enable-checksums.

Robert Treat
https://xzilla.net

#5Peter Eisentraut
peter@eisentraut.org
In reply to: Robert Treat (#4)
1 attachment(s)
Re: pg_upgrade: Support for upgrading to checksums enabled

On 21.02.25 00:41, Robert Treat wrote:

On Tue, Aug 27, 2024 at 5:57 PM Nathan Bossart <nathandbossart@gmail.com> wrote:

On Mon, Aug 26, 2024 at 08:23:44AM +0200, Peter Eisentraut wrote:

The purpose of this patch is to allow using pg_upgrade between clusters that
have different checksum settings. When upgrading between instances with
different checksum settings, the --copy (default) mode automatically sets
(or unsets) the checksum on the fly.

This would be particularly useful if we switched to enabling checksums by
default, as [0] proposes, but it's also useful without that.

Given enabling checksums can be rather expensive, I think it makes sense to
add a way to do it during pg_upgrade versus asking folks to run
pg_checksums separately. I'd anticipate arguments against enabling
checksums automatically, but as you noted, we can move it to a separate
option (e.g., --copy --enable-checksums). Disabling checksums with
pg_checksums is fast because it just updates pg_control, so I don't see any
need for --disable-checkums in pg_upgrade.

The repercussions of either enabling or disabling checksums on
accident is quite high (not for pg_upgrade, but for $futureDBA), so
ISTM an explicit flag for BOTH is the right way to go. In that
scenario, pg_upgrade would check to make sure the clusters match and
then make the appropriate suggestion. In the case someone did
something like --enable-checksums and --link, again, we'd toss an
error that --copy mode is required to --enable-checksums.

Here is an updated patch that works more along those lines. It adds a
pg_upgrade option --update-checksums, which activates the code to
rewrite the checksums. You must specify this option if the source and
target clusters have different checksum settings.

Note that this also works to hypothetically upgrade between future
different checksum versions (hence "--update-*", not "--enable-*").
Also, as the patch is currently written, it is also required to specify
this option to downgrade from checksums to no-checksums. (It will then
write a zero into the checksum place, as it would look if you had never
used checksums.) Also, you can optionally specify this option even if
the checksum settings are the same, then it will recalculate the
checksums. Probably not all of this is useful, but perhaps a subset of
it. Thoughts?

Also, I still don't know what to do about the Windows code path in
copyFile(). We could just not support this feature on Windows? Or
maybe the notionally correct thing to do would be to move that code into
copyFileByRange(). But then we'd need a different default on Windows
and it would require more documentation. I don't know what to do here
and I don't have enough context to make a suggestion. But if we don't
answer this, I don't think we can move ahead with this feature.

Attachments:

v2-0001-pg_upgrade-Support-for-upgrading-to-checksums-ena.patchtext/plain; charset=UTF-8; name=v2-0001-pg_upgrade-Support-for-upgrading-to-checksums-ena.patchDownload
From f73db6e9bd69d0fd4fbd034a691429a22b33b472 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <peter@eisentraut.org>
Date: Mon, 10 Mar 2025 18:44:30 +0100
Subject: [PATCH v2] pg_upgrade: Support for upgrading to checksums enabled

When upgrading between instances with different checksum settings, the
--copy (default) mode automatically sets (or unsets) the checksum on
the fly.

TODO: What to do about the Windows code path?
---
 src/bin/pg_upgrade/controldata.c   | 18 ++++++-----------
 src/bin/pg_upgrade/file.c          | 31 +++++++++++++++++++++++++++++-
 src/bin/pg_upgrade/option.c        |  9 +++++++++
 src/bin/pg_upgrade/pg_upgrade.h    |  4 +++-
 src/bin/pg_upgrade/relfilenumber.c |  2 +-
 5 files changed, 49 insertions(+), 15 deletions(-)

diff --git a/src/bin/pg_upgrade/controldata.c b/src/bin/pg_upgrade/controldata.c
index bd49ea867bf..7ac85ffb4ee 100644
--- a/src/bin/pg_upgrade/controldata.c
+++ b/src/bin/pg_upgrade/controldata.c
@@ -735,18 +735,12 @@ check_control_data(ControlData *oldctrl,
 	 * check_for_isn_and_int8_passing_mismatch().
 	 */
 
-	/*
-	 * We might eventually allow upgrades from checksum to no-checksum
-	 * clusters.
-	 */
-	if (oldctrl->data_checksum_version == 0 &&
-		newctrl->data_checksum_version != 0)
-		pg_fatal("old cluster does not use data checksums but the new one does");
-	else if (oldctrl->data_checksum_version != 0 &&
-			 newctrl->data_checksum_version == 0)
-		pg_fatal("old cluster uses data checksums but the new one does not");
-	else if (oldctrl->data_checksum_version != newctrl->data_checksum_version)
-		pg_fatal("old and new cluster pg_controldata checksum versions do not match");
+	/* FIXME: what about upgrade from checksum to non-checksum? */
+	if (oldctrl->data_checksum_version != newctrl->data_checksum_version)
+	{
+		if (!user_opts.update_checksums)
+			pg_fatal("when upgrading between clusters with different data checksum settings, option %s must be used", "--update-checksums");
+	}
 }
 
 
diff --git a/src/bin/pg_upgrade/file.c b/src/bin/pg_upgrade/file.c
index 7fd1991204a..62d4af3ce45 100644
--- a/src/bin/pg_upgrade/file.c
+++ b/src/bin/pg_upgrade/file.c
@@ -80,12 +80,14 @@ cloneFile(const char *src, const char *dst,
  */
 void
 copyFile(const char *src, const char *dst,
-		 const char *schemaName, const char *relName)
+		 const char *schemaName, const char *relName,
+		 int segno)
 {
 #ifndef WIN32
 	int			src_fd;
 	int			dest_fd;
 	char	   *buffer;
+	BlockNumber blocknum;
 
 	if ((src_fd = open(src, O_RDONLY | PG_BINARY, 0)) < 0)
 		pg_fatal("error while copying relation \"%s.%s\": could not open file \"%s\": %m",
@@ -101,6 +103,8 @@ copyFile(const char *src, const char *dst,
 
 	buffer = (char *) pg_malloc(COPY_BUF_SIZE);
 
+	blocknum = segno * old_cluster.controldata.largesz;
+
 	/* perform data copying i.e read src source, write to destination */
 	while (true)
 	{
@@ -113,6 +117,31 @@ copyFile(const char *src, const char *dst,
 		if (nbytes == 0)
 			break;
 
+		/*
+		 * Update checksums if requested
+		 */
+		if (user_opts.update_checksums)
+		{
+			/* must deal in whole blocks */
+			nbytes = nbytes / BLCKSZ * BLCKSZ;
+
+			for (int i = 0; i * BLCKSZ < nbytes; i++)
+			{
+				PageData   *page = buffer + i * BLCKSZ;
+				PageHeader	header = (PageHeader) page;
+
+				if (PageIsNew(page))
+					continue;
+
+				if (new_cluster.controldata.data_checksum_version == 1)
+					header->pd_checksum = pg_checksum_page(page, blocknum);
+				else
+					header->pd_checksum = 0;
+
+				blocknum++;
+			}
+		}
+
 		errno = 0;
 		if (write(dest_fd, buffer, nbytes) != nbytes)
 		{
diff --git a/src/bin/pg_upgrade/option.c b/src/bin/pg_upgrade/option.c
index 188dd8d8a8b..a0f50c10159 100644
--- a/src/bin/pg_upgrade/option.c
+++ b/src/bin/pg_upgrade/option.c
@@ -62,6 +62,7 @@ parseCommandLine(int argc, char *argv[])
 		{"sync-method", required_argument, NULL, 4},
 		{"no-statistics", no_argument, NULL, 5},
 		{"set-char-signedness", required_argument, NULL, 6},
+		{"update-checksums", no_argument, NULL, 7},
 
 		{NULL, 0, NULL, 0}
 	};
@@ -228,6 +229,11 @@ parseCommandLine(int argc, char *argv[])
 				else
 					pg_fatal("invalid argument for option %s", "--set-char-signedness");
 				break;
+
+			case 7:
+				user_opts.update_checksums = true;
+				break;
+
 			default:
 				fprintf(stderr, _("Try \"%s --help\" for more information.\n"),
 						os_info.progname);
@@ -241,6 +247,9 @@ parseCommandLine(int argc, char *argv[])
 	if (!user_opts.sync_method)
 		user_opts.sync_method = pg_strdup("fsync");
 
+	if (user_opts.update_checksums && user_opts.transfer_mode != TRANSFER_MODE_COPY)
+		pg_fatal("option %s requires transfer mode %s", "--update-checksums", "--copy");
+
 	if (log_opts.verbose)
 		pg_log(PG_REPORT, "Running in verbose mode");
 
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index f4e375d27c7..689365f4a81 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -337,6 +337,7 @@ typedef struct
 	int			char_signedness;	/* default char signedness: -1 for initial
 									 * value, 1 for "signed" and 0 for
 									 * "unsigned" */
+	bool		update_checksums;
 } UserOpts;
 
 typedef struct
@@ -414,7 +415,8 @@ bool		pid_lock_file_exists(const char *datadir);
 void		cloneFile(const char *src, const char *dst,
 					  const char *schemaName, const char *relName);
 void		copyFile(const char *src, const char *dst,
-					 const char *schemaName, const char *relName);
+					 const char *schemaName, const char *relName,
+					 int segno);
 void		copyFileByRange(const char *src, const char *dst,
 							const char *schemaName, const char *relName);
 void		linkFile(const char *src, const char *dst,
diff --git a/src/bin/pg_upgrade/relfilenumber.c b/src/bin/pg_upgrade/relfilenumber.c
index 8c23c583172..c9dc9a68ec6 100644
--- a/src/bin/pg_upgrade/relfilenumber.c
+++ b/src/bin/pg_upgrade/relfilenumber.c
@@ -248,7 +248,7 @@ transfer_relfile(FileNameMap *map, const char *type_suffix, bool vm_must_add_fro
 				case TRANSFER_MODE_COPY:
 					pg_log(PG_VERBOSE, "copying \"%s\" to \"%s\"",
 						   old_file, new_file);
-					copyFile(old_file, new_file, map->nspname, map->relname);
+					copyFile(old_file, new_file, map->nspname, map->relname, segno);
 					break;
 				case TRANSFER_MODE_COPY_FILE_RANGE:
 					pg_log(PG_VERBOSE, "copying \"%s\" to \"%s\" with copy_file_range",
-- 
2.48.1

#6Peter Eisentraut
peter@eisentraut.org
In reply to: Peter Eisentraut (#5)
Re: pg_upgrade: Support for upgrading to checksums enabled

On 11.03.25 11:42, Peter Eisentraut wrote:

Here is an updated patch that works more along those lines.  It adds a
pg_upgrade option --update-checksums, which activates the code to
rewrite the checksums.  You must specify this option if the source and
target clusters have different checksum settings.

Note that this also works to hypothetically upgrade between future
different checksum versions (hence "--update-*", not "--enable-*").
Also, as the patch is currently written, it is also required to specify
this option to downgrade from checksums to no-checksums.  (It will then
write a zero into the checksum place, as it would look if you had never
used checksums.)  Also, you can optionally specify this option even if
the checksum settings are the same, then it will recalculate the
checksums.  Probably not all of this is useful, but perhaps a subset of
it.  Thoughts?

Also, I still don't know what to do about the Windows code path in
copyFile().  We could just not support this feature on Windows?  Or
maybe the notionally correct thing to do would be to move that code into
copyFileByRange().  But then we'd need a different default on Windows
and it would require more documentation.  I don't know what to do here
and I don't have enough context to make a suggestion.  But if we don't
answer this, I don't think we can move ahead with this feature.

I'm not sensing much enthusiasm for this feature or for working out the
remaining problems, so I'm closing this commitfest entry.

#7Robert Treat
rob@xzilla.net
In reply to: Peter Eisentraut (#6)
Re: pg_upgrade: Support for upgrading to checksums enabled

On Thu, Apr 3, 2025 at 3:25 AM Peter Eisentraut <peter@eisentraut.org> wrote:

On 11.03.25 11:42, Peter Eisentraut wrote:

Here is an updated patch that works more along those lines. It adds a
pg_upgrade option --update-checksums, which activates the code to
rewrite the checksums. You must specify this option if the source and
target clusters have different checksum settings.

Note that this also works to hypothetically upgrade between future
different checksum versions (hence "--update-*", not "--enable-*").
Also, as the patch is currently written, it is also required to specify
this option to downgrade from checksums to no-checksums. (It will then
write a zero into the checksum place, as it would look if you had never
used checksums.) Also, you can optionally specify this option even if
the checksum settings are the same, then it will recalculate the
checksums. Probably not all of this is useful, but perhaps a subset of
it. Thoughts?

Also, I still don't know what to do about the Windows code path in
copyFile(). We could just not support this feature on Windows? Or
maybe the notionally correct thing to do would be to move that code into
copyFileByRange(). But then we'd need a different default on Windows
and it would require more documentation. I don't know what to do here
and I don't have enough context to make a suggestion. But if we don't
answer this, I don't think we can move ahead with this feature.

I'm not sensing much enthusiasm for this feature or for working out the
remaining problems, so I'm closing this commitfest entry.

That's unfortunate; I think there is enthusiasm for the feature, just
not enough to overcome the questions around Windows support.

Robert Treat
https://xzilla.net