Speed up pg_checksums in cases where checksum already set

Started by Greg Sabino Mullaneover 4 years ago15 messages

htamfids@gmail.com

over 4 years ago

1 attachment(s)

The attached patch makes an optimization to pg_checksums which prevents
rewriting the block if the checksum is already what we expect. This can
lead to much faster runs in cases where it is already set (e.g. enabled ->
disabled -> enable, external helper process, interrupted runs, future
parallel processes). There is also an effort to not sync the data directory
if no changes were written. Finally, added a bit more output on how many
files were actually changed, e.g.:

Checksum operation completed
Files scanned: 1236
Blocks scanned: 23283
Files modified: 38
Blocks modified: 19194
pg_checksums: syncing data directory
pg_checksums: updating control file
Checksums enabled in cluster

Cheers,
Greg

Attachments:

pg_checksums.optimize.writes.patchapplication/octet-stream; name=pg_checksums.optimize.writes.patchDownload

diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 831cf42d3a..c24e3d4093 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -32,7 +32,9 @@
 
 
 static int64 files = 0;
+static int64 total_files_modified = 0;
 static int64 blocks = 0;
+static int64 total_blocks_modified = 0;
 static int64 badblocks = 0;
 static ControlFileData *ControlFile;
 
@@ -195,6 +197,7 @@ scan_file(const char *fn, BlockNumber segmentno)
 	int			f;
 	BlockNumber blockno;
 	int			flags;
+	int64		blocks_modified = 0;
 
 	Assert(mode == PG_MODE_ENABLE ||
 		   mode == PG_MODE_CHECK);
@@ -256,6 +259,13 @@ scan_file(const char *fn, BlockNumber segmentno)
 		{
 			int			w;
 
+			/* Do not rewrite if the checksum is already set to the expected value */
+			if (header->pd_checksum == csum) {
+				continue;
+			}
+
+			blocks_modified++;
+
 			/* Set checksum in page header */
 			header->pd_checksum = csum;
 
@@ -292,6 +302,12 @@ scan_file(const char *fn, BlockNumber segmentno)
 			pg_log_info("checksums enabled in file \"%s\"", fn);
 	}
 
+	if (blocks_modified)
+	{
+		total_files_modified++;
+		total_blocks_modified += blocks_modified;
+	}
+
 	close(f);
 }
 
@@ -637,8 +653,8 @@ main(int argc, char *argv[])
 			progress_report(true);
 
 		printf(_("Checksum operation completed\n"));
-		printf(_("Files scanned:  %s\n"), psprintf(INT64_FORMAT, files));
-		printf(_("Blocks scanned: %s\n"), psprintf(INT64_FORMAT, blocks));
+		printf(_("Files scanned:   %s\n"), psprintf(INT64_FORMAT, files));
+		printf(_("Blocks scanned:  %s\n"), psprintf(INT64_FORMAT, blocks));
 		if (mode == PG_MODE_CHECK)
 		{
 			printf(_("Bad checksums:  %s\n"), psprintf(INT64_FORMAT, badblocks));
@@ -647,6 +663,12 @@ main(int argc, char *argv[])
 			if (badblocks > 0)
 				exit(1);
 		}
+		else if (mode == PG_MODE_ENABLE)
+		{
+			printf(_("Files modified:  %s\n"), psprintf(INT64_FORMAT, total_files_modified));
+			printf(_("Blocks modified: %s\n"), psprintf(INT64_FORMAT, total_blocks_modified));
+		}
+
 	}
 
 	/*
@@ -659,7 +681,7 @@ main(int argc, char *argv[])
 		ControlFile->data_checksum_version =
 			(mode == PG_MODE_ENABLE) ? PG_DATA_CHECKSUM_VERSION : 0;
 
-		if (do_sync)
+		if (do_sync && total_files_modified)
 		{
 			pg_log_info("syncing data directory");
 			fsync_pgdata(DataDir, PG_VERSION_NUM);

Julien Rouhaud

rjuju123@gmail.com

over 4 years ago

In reply to: Greg Sabino Mullane (#1)

Re: Speed up pg_checksums in cases where checksum already set

On Thu, May 27, 2021 at 5:24 AM Greg Sabino Mullane <htamfids@gmail.com> wrote:

The attached patch makes an optimization to pg_checksums which prevents rewriting the block if the checksum is already what we expect. This can lead to much faster runs in cases where it is already set (e.g. enabled -> disabled -> enable, external helper process, interrupted runs, future parallel processes). There is also an effort to not sync the data directory if no changes were written. Finally, added a bit more output on how many files were actually changed, e.g.:

I don't know how often this will actually help as probably people
aren't toggling the checksum state that often, but it seems like a
good idea overall. The patch looks sensible to me.

Michael Paquier

michael@paquier.xyz

over 4 years ago

In reply to: Greg Sabino Mullane (#1)

Re: Speed up pg_checksums in cases where checksum already set

On Wed, May 26, 2021 at 05:23:55PM -0400, Greg Sabino Mullane wrote:

The attached patch makes an optimization to pg_checksums which prevents
rewriting the block if the checksum is already what we expect. This can
lead to much faster runs in cases where it is already set (e.g. enabled ->
disabled -> enable, external helper process, interrupted runs, future
parallel processes).

Makes sense.

There is also an effort to not sync the data directory
if no changes were written. Finally, added a bit more output on how many
files were actually changed, e.g.:

-    if (do_sync)
+    if (do_sync && total_files_modified)
     {
 	 pg_log_info("syncing data directory");
         fsync_pgdata(DataDir, PG_VERSION_NUM);

Here, I am on the edge. It could be an advantage to force a flush of
the data folder anyway, no? Say, all the pages have a correct
checksum and they are in the OS cache, but they may not have been
flushed yet. That would emulate what initdb -S does already.

Checksum operation completed
Files scanned: 1236
Blocks scanned: 23283
Files modified: 38
Blocks modified: 19194
pg_checksums: syncing data directory
pg_checksums: updating control file
Checksums enabled in cluster

The addition of the number of files modified looks like an advantage.
--
Michael

Justin Pryzby

pryzby@telsasoft.com

over 4 years ago

In reply to: Greg Sabino Mullane (#1)

Re: Speed up pg_checksums in cases where checksum already set

In one of the checksum patches, there was an understanding that the pages
should be written even if the checksum is correct, to handle replicas.

From the v19 patch:
https://www.postgresql.org/message-id/F7AFCFCD-8F77-4546-8D42-C7F675A4B680%40yesql.se
+                * Mark the buffer as dirty and force a full page write.  We have to
+                * re-write the page to WAL even if the checksum hasn't changed,
+                * because if there is a replica it might have a slightly different
+                * version of the page with an invalid checksum, caused by unlogged
+                * changes (e.g. hintbits) on the master happening while checksums
+                * were off. This can happen if there was a valid checksum on the page
+                * at one point in the past, so only when checksums are first on, then
+                * off, and then turned on again.

pg_checksums(1) says:

| When using a replication setup with tools which perform direct copies of relation file blocks (for example pg_rewind(1)), enabling or disabling checksums can lead to page
| corruptions in the shape of incorrect checksums if the operation is not done consistently across all nodes. When enabling or disabling checksums in a replication setup, it
| is thus recommended to stop all the clusters before switching them all consistently. Destroying all standbys, performing the operation on the primary and finally recreating
| the standbys from scratch is also safe.

Does your patch complicate things for the "stop all the clusters before
switching them all" case?

--
Justin

Michael Paquier

michael@paquier.xyz

over 4 years ago

In reply to: Justin Pryzby (#4)

Re: Speed up pg_checksums in cases where checksum already set

On Wed, May 26, 2021 at 09:29:43PM -0500, Justin Pryzby wrote:

In one of the checksum patches, there was an understanding that the pages
should be written even if the checksum is correct, to handle replicas.

From the v19 patch:
https://www.postgresql.org/message-id/F7AFCFCD-8F77-4546-8D42-C7F675A4B680%40yesql.se
+                * Mark the buffer as dirty and force a full page write.  We have to
+                * re-write the page to WAL even if the checksum hasn't changed,
+                * because if there is a replica it might have a slightly different
+                * version of the page with an invalid checksum, caused by unlogged
+                * changes (e.g. hintbits) on the master happening while checksums
+                * were off. This can happen if there was a valid checksum on the page
+                * at one point in the past, so only when checksums are first on, then
+                * off, and then turned on again.

I am not really following the line of argument here. pg_checksums
relies on the fact that the cluster has been safely shut down before
running. So, if this comes to standbys, they would have reached a
consistent point, and the shutdown makes sure that all pages are
flushed.
--
Michael

Greg Sabino Mullane

htamfids@gmail.com

over 4 years ago

In reply to: Michael Paquier (#3)

Re: Speed up pg_checksums in cases where checksum already set

Thanks for the quick replies, everyone.

On Wed, May 26, 2021 at 10:17 PM Michael Paquier <michael@paquier.xyz>
wrote:

-    if (do_sync)
+    if (do_sync && total_files_modified)
Here, I am on the edge.  It could be an advantage to force a flush of
the data folder anyway, no?

I was originally on the fence about including this as well, but it seems
like since the database is shut down and already in a consistent state,
there seems no advantage to syncing if we have not made any changes. Things
are no better or worse than when we arrived. However, the real-world use
case of running pg_checksums --enable and getting no changed blocks is
probably fairly rare, so if there is a strong objection, I'm happy
reverting to just (do_sync). (I'm not sure how cheap a sync is, I assume
it's low impact as the database is shut down, I guess it becomes a "might
as well while we are here"?)

On Wed, May 26, 2021 at 10:29 PM Justin Pryzby <pryzby@telsasoft.com> wrote:

In one of the checksum patches, there was an understanding that the pages
should be written even if the checksum is correct, to handle replicas.
...
Does your patch complicate things for the "stop all the clusters before
switching them all" case?

I cannot imagine how it would, but, like Michael, I'm not really
understanding the reasoning here. We only run when safely shutdown, so no
WAL or dirty buffers need concern us :). Of course, once the postmaster is
up and running, fiddling with checksums becomes vastly more complicated, as
evidenced by that thread. I'm happy sticking to and speeding up the offline
version for now.

Cheers,
Greg

Michael Paquier

michael@paquier.xyz

over 4 years ago

In reply to: Greg Sabino Mullane (#6)

Re: Speed up pg_checksums in cases where checksum already set

On Thu, May 27, 2021 at 10:29:14AM -0400, Greg Sabino Mullane wrote:

I was originally on the fence about including this as well, but it seems
like since the database is shut down and already in a consistent state,
there seems no advantage to syncing if we have not made any changes. Things
are no better or worse than when we arrived. However, the real-world use
case of running pg_checksums --enable and getting no changed blocks is
probably fairly rare, so if there is a strong objection, I'm happy
reverting to just (do_sync). (I'm not sure how cheap a sync is, I assume
it's low impact as the database is shut down, I guess it becomes a "might
as well while we are here"?)

I understand that this should be rare, but I don't want to take any
bets either. With this patch, we could finish with cases where some
pages are still in the OS cache but don't get flushed because a
previous cancellation let the cluster in a state where all the page
checksums have been written out but a portion of the files were not
synced. A follow-up run of pg_checksums would see that all the pages
are correct, but would think that no sync is required, incorrectly.
--
Michael

Greg Sabino Mullane

htamfids@gmail.com

over 4 years ago

In reply to: Michael Paquier (#7)

1 attachment(s)

Re: Speed up pg_checksums in cases where checksum already set

Fair enough; thanks for the feedback. Attached is a new version that does
an unconditional sync (well, unless do_sync is false, a flag I am not
particularly fond of).

Cheers,
Greg

Attachments:

pg_checksums.optimize.writes.always.sync.patchapplication/octet-stream; name=pg_checksums.optimize.writes.always.sync.patchDownload

commit af64123688e8999362a578735f5b43b76a1bbc05
Author: Greg Sabino Mullane <greg@turnstep.com>
Date:   Wed Jun 2 10:14:57 2021 -0400

    Skip rewriting the file if checksums already match.

diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 831cf42d3a..afd3e598eb 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -32,7 +32,9 @@
 
 
 static int64 files = 0;
+static int64 total_files_modified = 0;
 static int64 blocks = 0;
+static int64 total_blocks_modified = 0;
 static int64 badblocks = 0;
 static ControlFileData *ControlFile;
 
@@ -195,6 +197,7 @@ scan_file(const char *fn, BlockNumber segmentno)
 	int			f;
 	BlockNumber blockno;
 	int			flags;
+	int64		blocks_modified = 0;
 
 	Assert(mode == PG_MODE_ENABLE ||
 		   mode == PG_MODE_CHECK);
@@ -256,6 +259,13 @@ scan_file(const char *fn, BlockNumber segmentno)
 		{
 			int			w;
 
+			/* Do not rewrite if the checksum is already set to the expected value */
+			if (header->pd_checksum == csum) {
+				continue;
+			}
+
+			blocks_modified++;
+
 			/* Set checksum in page header */
 			header->pd_checksum = csum;
 
@@ -292,6 +302,12 @@ scan_file(const char *fn, BlockNumber segmentno)
 			pg_log_info("checksums enabled in file \"%s\"", fn);
 	}
 
+	if (blocks_modified)
+	{
+		total_files_modified++;
+		total_blocks_modified += blocks_modified;
+	}
+
 	close(f);
 }
 
@@ -637,8 +653,8 @@ main(int argc, char *argv[])
 			progress_report(true);
 
 		printf(_("Checksum operation completed\n"));
-		printf(_("Files scanned:  %s\n"), psprintf(INT64_FORMAT, files));
-		printf(_("Blocks scanned: %s\n"), psprintf(INT64_FORMAT, blocks));
+		printf(_("Files scanned:   %s\n"), psprintf(INT64_FORMAT, files));
+		printf(_("Blocks scanned:  %s\n"), psprintf(INT64_FORMAT, blocks));
 		if (mode == PG_MODE_CHECK)
 		{
 			printf(_("Bad checksums:  %s\n"), psprintf(INT64_FORMAT, badblocks));
@@ -647,6 +663,12 @@ main(int argc, char *argv[])
 			if (badblocks > 0)
 				exit(1);
 		}
+		else if (mode == PG_MODE_ENABLE)
+		{
+			printf(_("Files modified:  %s\n"), psprintf(INT64_FORMAT, total_files_modified));
+			printf(_("Blocks modified: %s\n"), psprintf(INT64_FORMAT, total_blocks_modified));
+		}
+
 	}
 
 	/*

Greg Sabino Mullane

htamfids@gmail.com

over 4 years ago

In reply to: Greg Sabino Mullane (#8)

1 attachment(s)

Re: Speed up pg_checksums in cases where checksum already set

Newer version attach that adds a small documentation tweak as well.

Cheers,
Greg

Attachments:

pg_checksums.optimize.writes.and.always.sync.patchapplication/octet-stream; name=pg_checksums.optimize.writes.and.always.sync.patchDownload

commit 086b33ab9858ed658d001744c0d501257c9fcd7c
Author: Greg Sabino Mullane <greg@turnstep.com>
Date:   Wed Jun 2 10:14:57 2021 -0400

    Skip rewriting the file if the checksums already match.

diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index c84bc5c5b2..9a97a5e92f 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -47,7 +47,8 @@ PostgreSQL documentation
 
   <para>
    When verifying checksums, every file in the cluster is scanned. When
-   enabling checksums, every file in the cluster is rewritten in-place.
+   enabling checksums, every file in the cluster with a changed checksum is 
+   rewritten in-place.
    Disabling checksums only updates the file <filename>pg_control</filename>.
   </para>
  </refsect1>
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 831cf42d3a..afd3e598eb 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -32,7 +32,9 @@
 
 
 static int64 files = 0;
+static int64 total_files_modified = 0;
 static int64 blocks = 0;
+static int64 total_blocks_modified = 0;
 static int64 badblocks = 0;
 static ControlFileData *ControlFile;
 
@@ -195,6 +197,7 @@ scan_file(const char *fn, BlockNumber segmentno)
 	int			f;
 	BlockNumber blockno;
 	int			flags;
+	int64		blocks_modified = 0;
 
 	Assert(mode == PG_MODE_ENABLE ||
 		   mode == PG_MODE_CHECK);
@@ -256,6 +259,13 @@ scan_file(const char *fn, BlockNumber segmentno)
 		{
 			int			w;
 
+			/* Do not rewrite if the checksum is already set to the expected value */
+			if (header->pd_checksum == csum) {
+				continue;
+			}
+
+			blocks_modified++;
+
 			/* Set checksum in page header */
 			header->pd_checksum = csum;
 
@@ -292,6 +302,12 @@ scan_file(const char *fn, BlockNumber segmentno)
 			pg_log_info("checksums enabled in file \"%s\"", fn);
 	}
 
+	if (blocks_modified)
+	{
+		total_files_modified++;
+		total_blocks_modified += blocks_modified;
+	}
+
 	close(f);
 }
 
@@ -637,8 +653,8 @@ main(int argc, char *argv[])
 			progress_report(true);
 
 		printf(_("Checksum operation completed\n"));
-		printf(_("Files scanned:  %s\n"), psprintf(INT64_FORMAT, files));
-		printf(_("Blocks scanned: %s\n"), psprintf(INT64_FORMAT, blocks));
+		printf(_("Files scanned:   %s\n"), psprintf(INT64_FORMAT, files));
+		printf(_("Blocks scanned:  %s\n"), psprintf(INT64_FORMAT, blocks));
 		if (mode == PG_MODE_CHECK)
 		{
 			printf(_("Bad checksums:  %s\n"), psprintf(INT64_FORMAT, badblocks));
@@ -647,6 +663,12 @@ main(int argc, char *argv[])
 			if (badblocks > 0)
 				exit(1);
 		}
+		else if (mode == PG_MODE_ENABLE)
+		{
+			printf(_("Files modified:  %s\n"), psprintf(INT64_FORMAT, total_files_modified));
+			printf(_("Blocks modified: %s\n"), psprintf(INT64_FORMAT, total_blocks_modified));
+		}
+
 	}
 
 	/*

#10

Michael Paquier

michael@paquier.xyz

over 4 years ago

In reply to: Greg Sabino Mullane (#9)

Re: Speed up pg_checksums in cases where checksum already set

On Wed, Jun 02, 2021 at 05:09:36PM -0400, Greg Sabino Mullane wrote:

Newer version attach that adds a small documentation tweak as well.

-   enabling checksums, every file in the cluster is rewritten in-place.
+   enabling checksums, every file in the cluster with a changed checksum is
+   rewritten in-place.

This doc addition is a bit confusing, as it could mean that each file
has just one single checksum. We could be more precise, say:
"When enabling checksums, each relation file block with a changed
checksum is rewritten in place."

Should we also mention that the sync happens even if no blocks are
rewritten based on the reasoning of upthread (aka we'd better do the
final flush as an interrupted pg_checksums may let a portion of the
files as not flushed)?
--
Michael

#11

Greg Sabino Mullane

htamfids@gmail.com

over 4 years ago

In reply to: Michael Paquier (#10)

1 attachment(s)

Re: Speed up pg_checksums in cases where checksum already set

On Fri, Jun 18, 2021 at 1:57 AM Michael Paquier <michael@paquier.xyz> wrote:

This doc addition is a bit confusing, as it could mean that each file
has just one single checksum. We could be more precise, say:
"When enabling checksums, each relation file block with a changed
checksum is rewritten in place."

Agreed, I like that wording. New patch attached.

Should we also mention that the sync happens even if no blocks are
rewritten based on the reasoning of upthread (aka we'd better do the
final flush as an interrupted pg_checksums may let a portion of the
files as not flushed)?

I don't know that we need to bother: the default is already to sync and one
has to go out of one's way using the -N argument to NOT sync, so I think
it's a pretty safe assumption to everyone (except those who read my first
version of my patch!) that syncing always happens.

Cheers,
Greg

Attachments:

003.pg_checksums.optimize.writes.and.always.sync.patchapplication/octet-stream; name=003.pg_checksums.optimize.writes.and.always.sync.patchDownload

commit 9f0f0a5e5e300a07f93bdecd3c9ba99c27d73675
Author: Greg Sabino Mullane <greg@turnstep.com>
Date:   Fri Jun 18 19:55:44 2021 -0400

    pg_checksums: skip rewriting the file if the checksums already match

diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index c84bc5c5b2..6a6aebed51 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -47,7 +47,8 @@ PostgreSQL documentation
 
   <para>
    When verifying checksums, every file in the cluster is scanned. When
-   enabling checksums, every file in the cluster is rewritten in-place.
+   enabling checksums, each relation file block with a changed checksum is 
+   rewritten in-place.
    Disabling checksums only updates the file <filename>pg_control</filename>.
   </para>
  </refsect1>
diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 831cf42d3a..afd3e598eb 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -32,7 +32,9 @@
 
 
 static int64 files = 0;
+static int64 total_files_modified = 0;
 static int64 blocks = 0;
+static int64 total_blocks_modified = 0;
 static int64 badblocks = 0;
 static ControlFileData *ControlFile;
 
@@ -195,6 +197,7 @@ scan_file(const char *fn, BlockNumber segmentno)
 	int			f;
 	BlockNumber blockno;
 	int			flags;
+	int64		blocks_modified = 0;
 
 	Assert(mode == PG_MODE_ENABLE ||
 		   mode == PG_MODE_CHECK);
@@ -256,6 +259,13 @@ scan_file(const char *fn, BlockNumber segmentno)
 		{
 			int			w;
 
+			/* Do not rewrite if the checksum is already set to the expected value */
+			if (header->pd_checksum == csum) {
+				continue;
+			}
+
+			blocks_modified++;
+
 			/* Set checksum in page header */
 			header->pd_checksum = csum;
 
@@ -292,6 +302,12 @@ scan_file(const char *fn, BlockNumber segmentno)
 			pg_log_info("checksums enabled in file \"%s\"", fn);
 	}
 
+	if (blocks_modified)
+	{
+		total_files_modified++;
+		total_blocks_modified += blocks_modified;
+	}
+
 	close(f);
 }
 
@@ -637,8 +653,8 @@ main(int argc, char *argv[])
 			progress_report(true);
 
 		printf(_("Checksum operation completed\n"));
-		printf(_("Files scanned:  %s\n"), psprintf(INT64_FORMAT, files));
-		printf(_("Blocks scanned: %s\n"), psprintf(INT64_FORMAT, blocks));
+		printf(_("Files scanned:   %s\n"), psprintf(INT64_FORMAT, files));
+		printf(_("Blocks scanned:  %s\n"), psprintf(INT64_FORMAT, blocks));
 		if (mode == PG_MODE_CHECK)
 		{
 			printf(_("Bad checksums:  %s\n"), psprintf(INT64_FORMAT, badblocks));
@@ -647,6 +663,12 @@ main(int argc, char *argv[])
 			if (badblocks > 0)
 				exit(1);
 		}
+		else if (mode == PG_MODE_ENABLE)
+		{
+			printf(_("Files modified:  %s\n"), psprintf(INT64_FORMAT, total_files_modified));
+			printf(_("Blocks modified: %s\n"), psprintf(INT64_FORMAT, total_blocks_modified));
+		}
+
 	}
 
 	/*

#12

Michael Paquier

michael@paquier.xyz

over 4 years ago

In reply to: Greg Sabino Mullane (#11)

Re: Speed up pg_checksums in cases where checksum already set

On Fri, Jun 18, 2021 at 08:01:17PM -0400, Greg Sabino Mullane wrote:

I don't know that we need to bother: the default is already to sync and one
has to go out of one's way using the -N argument to NOT sync, so I think
it's a pretty safe assumption to everyone (except those who read my first
version of my patch!) that syncing always happens.

Perhaps you are right to keep it simple. If people would like to
document that more precisely, it could always be changed if
necessary. What you have here sounds pretty much right to me.
--
Michael

#13

Michael Paquier

michael@paquier.xyz

over 4 years ago

In reply to: Michael Paquier (#12)

1 attachment(s)

Re: Speed up pg_checksums in cases where checksum already set

On Wed, Jun 23, 2021 at 09:39:37AM +0900, Michael Paquier wrote:

Perhaps you are right to keep it simple. If people would like to
document that more precisely, it could always be changed if
necessary. What you have here sounds pretty much right to me.

So, I was looking at this one today, and got confused with the name of
the counters once the patch was in place as this leads to having
things like "blocks" and "total_blocks_modified", which is a bit
confusing as "blocks" stands for the number of blocks scanned,
including new pages. I have simply suffixed "files" and "blocks" with
"_scanned" to be more consistent with the new counters that are named
"_written", giving the attached. We still need to have the per-file
counter in scan_file() with the global counters updated at the end of
a file scan for the sake of the file counter, of course.

Does that look fine to you?
--
Michael

Attachments:

004.pg_checksums.optimize.writes.and.always.sync.patchtext/x-diff; charset=us-asciiDownload

diff --git a/src/bin/pg_checksums/pg_checksums.c b/src/bin/pg_checksums/pg_checksums.c
index 831cf42d3a..a13cf22b57 100644
--- a/src/bin/pg_checksums/pg_checksums.c
+++ b/src/bin/pg_checksums/pg_checksums.c
@@ -31,8 +31,10 @@
 #include "storage/checksum_impl.h"
 
 
-static int64 files = 0;
-static int64 blocks = 0;
+static int64 files_scanned = 0;
+static int64 files_written = 0;
+static int64 blocks_scanned = 0;
+static int64 blocks_written = 0;
 static int64 badblocks = 0;
 static ControlFileData *ControlFile;
 
@@ -195,6 +197,7 @@ scan_file(const char *fn, BlockNumber segmentno)
 	int			f;
 	BlockNumber blockno;
 	int			flags;
+	int64		blocks_written_in_file = 0;
 
 	Assert(mode == PG_MODE_ENABLE ||
 		   mode == PG_MODE_CHECK);
@@ -208,7 +211,7 @@ scan_file(const char *fn, BlockNumber segmentno)
 		exit(1);
 	}
 
-	files++;
+	files_scanned++;
 
 	for (blockno = 0;; blockno++)
 	{
@@ -227,7 +230,7 @@ scan_file(const char *fn, BlockNumber segmentno)
 							 blockno, fn, r, BLCKSZ);
 			exit(1);
 		}
-		blocks++;
+		blocks_scanned++;
 
 		/*
 		 * Since the file size is counted as total_size for progress status
@@ -256,6 +259,12 @@ scan_file(const char *fn, BlockNumber segmentno)
 		{
 			int			w;
 
+			/* Do not rewrite if the checksum is already set to the expected value */
+			if (header->pd_checksum == csum)
+				continue;
+
+			blocks_written_in_file++;
+
 			/* Set checksum in page header */
 			header->pd_checksum = csum;
 
@@ -292,6 +301,13 @@ scan_file(const char *fn, BlockNumber segmentno)
 			pg_log_info("checksums enabled in file \"%s\"", fn);
 	}
 
+	/* Update write counters if any write activity has happened */
+	if (blocks_written_in_file > 0)
+	{
+		files_written++;
+		blocks_written += blocks_written_in_file;
+	}
+
 	close(f);
 }
 
@@ -637,8 +653,8 @@ main(int argc, char *argv[])
 			progress_report(true);
 
 		printf(_("Checksum operation completed\n"));
-		printf(_("Files scanned:  %s\n"), psprintf(INT64_FORMAT, files));
-		printf(_("Blocks scanned: %s\n"), psprintf(INT64_FORMAT, blocks));
+		printf(_("Files scanned:   %s\n"), psprintf(INT64_FORMAT, files_scanned));
+		printf(_("Blocks scanned:  %s\n"), psprintf(INT64_FORMAT, blocks_scanned));
 		if (mode == PG_MODE_CHECK)
 		{
 			printf(_("Bad checksums:  %s\n"), psprintf(INT64_FORMAT, badblocks));
@@ -647,6 +663,11 @@ main(int argc, char *argv[])
 			if (badblocks > 0)
 				exit(1);
 		}
+		else if (mode == PG_MODE_ENABLE)
+		{
+			printf(_("Files written:  %s\n"), psprintf(INT64_FORMAT, files_written));
+			printf(_("Blocks written: %s\n"), psprintf(INT64_FORMAT, blocks_written));
+		}
 	}
 
 	/*
diff --git a/doc/src/sgml/ref/pg_checksums.sgml b/doc/src/sgml/ref/pg_checksums.sgml
index c84bc5c5b2..6a6aebed51 100644
--- a/doc/src/sgml/ref/pg_checksums.sgml
+++ b/doc/src/sgml/ref/pg_checksums.sgml
@@ -47,7 +47,8 @@ PostgreSQL documentation
 
   <para>
    When verifying checksums, every file in the cluster is scanned. When
-   enabling checksums, every file in the cluster is rewritten in-place.
+   enabling checksums, each relation file block with a changed checksum is 
+   rewritten in-place.
    Disabling checksums only updates the file <filename>pg_control</filename>.
   </para>
  </refsect1>

#14

Greg Sabino Mullane

htamfids@gmail.com

over 4 years ago

In reply to: Michael Paquier (#13)

Re: Speed up pg_checksums in cases where checksum already set

On Tue, Jun 29, 2021 at 2:59 AM Michael Paquier <michael@paquier.xyz> wrote: